This work is a tribute 
To all my teachers 
The ones I met and the ones I did not 


Dedication 
To all my children 


Preface 


This is a beginning graduate book on real and functional analysis, with a significant 
component on topology. The prerequisites include a solid understanding of 
undergraduate real analysis and linear algebra, and a good degree of mathematical 
maturity. Rudimentary knowledge of metric spaces, although not required, is a 
huge asset. With the singular exception of Liouville’s theorem (stated without 
proof), and a passing reference to Laurent series, knowledge of complex analysis 
is neither assumed nor needed. 


It is possible for students with high mathematical aptitude to study this book 
independently. However, the book is designed as a textbook for well-prepared 
students of mathematics, to be taught under the able guidance of an instructor. 
I like to think of this book as an accessible classical introduction to the subject. 
The goal is to provide a springboard from which students can dive into greater 
depths in the sea of mathematics. 


The book is neither encyclopedic nor a shallow introduction. The aim is to achieve 
excellent breadth and depth. The topics are organized logically but not rigidly, in 
order to maximize utility and the potential readership. The careful sequencing 
of the sections is designed to allow instructors to select topics that suit their 
course goals, student backgrounds, and time limitations. Although the proofs are 
detailed, I hope the reader will find the writing style clear and concise. The section 
exercises constitute an important complement to the results in the main body of 
the section. Indeed, some of the exercises provide alternative approaches to some 
topics, and generalizations of some of the results in the main text are considered in 
the exercises. The book synopsis included after the preface furnishes more details 
on the structure of the book and brief chapter descriptions. 


I deliberately avoided making specific bibliographic citations within the body of 
the text. There are two main reasons for this. First, all the results in this book are 
well established and can be found in multiple sources. Second, the book contains 
no original results. Therefore, the lack of bibliographic citations or the absence 
of any specific source must not be conflated with claims of originality. I did not 
number the definitions, in order to prevent item numbers within sections from 
escalating to an annoying level. Definitions are seldom referenced far from where 
they first appear, and the extensive index and the glossary of symbols should help 
the reader locate items easily. Examples are locally and manually numbered within 
each section. 


Almost all of the historical information contained in this book is abridged, 
with large excerpts included without quotation marks, from J. J. O'Connor and 
E. F. Robertson's articles in the MacTutor History of Mathematics archive, School of 
Mathematics and Statistics, University of St Andrews, Scotland (see 
http://www-history.mcs.st-andrews.ac.uk/index.html). 


Sir Isaac Newton once said that if he had seen further than others, it was by 
standing on the shoulders of giants. I am no giant, but this book is the shoulder I 
have to offer. Perhaps a few students will climb and will be able to see farther than 
I have. 


The Book in Synopsis 


The book in its entirety contains enough material for a two-semester course. The 
core of the book can be used for an easy paced two-semester course. If a definition 
of the core contents of the book is desirable, I define the core to consist of the 
following sections, in addition to the very basic ideas in sections 1.1, 1.2, and 3.1: 


▶ Sections 2.1 and 2.2 
▶ Sections 3.2-3.4, 3.6, and 3.7 
▶ Sections 4.1-4.10 
▶ Sections 5.1-5.4 and 5.6-5.8 
▶ Sections 6.1-6.4 
▶ Sections 7.1 and 7.2 
▶ Sections 8.1-8.4 


Part I. Background Material 


Instructors can choose material from this part as their students’ background 
warrants. The most basic results in the first three chapters are stated without proof. 


Chapter 1. This chapter furnishes a brief refresher of basic concepts. The natural, 
rational, and real number systems are taken for granted, although we develop the 
completeness of the real line and the Bolzano-Weierstrass theorem at length, 
as well as the complex number field, including its completeness. Embryonic 
manifestations of completeness and compactness can be seen in this chapter. 
Examples include the nested interval theorem and the uniform continuity of 
continuous functions on compact intervals, and our proof of the Heine-Borel 
theorem in chapter 4 is squarely based on the Bolzano-Weierstrass property of 
bounded sets. 


Chapter 2. This chapter fills in any potential gaps that may exist in the 
student’s knowledge of set theory. Sections 2.1 and 2.2 are essential for a proper 
understanding of the rest of the book. In particular, a thorough understanding 
of countability and Zorn’s lemma is indispensable. Some of section 2.3 may be 
included, but only an intuitive understanding of cardinal numbers is sufficient. 
Studying section 2.3 up to theorem 2.3.4, together with theorem 2.3.13, is sufficient 
to follow the discussion on the existence of a vector space of arbitrary (infinite) 
dimension, and the existence of inseparable Hilbert spaces. Cardinal arithmetic 
can be omitted. Indeed, the results on cardinal arithmetic are applied only once in 
order to prove the invariance of the cardinality of a linear basis of a vector space. 
Ordinal numbers have been carefully avoided. 


Chapter 3. It is this author’s observation that the undergraduate linear algebra 
curriculum has settled into a matrix theory mode without enough exposure to 
vector space theory. This chapter aims to provide a solid but brief account of the 
theory of vector spaces. The reader is assumed to have good knowledge of the 
basic definitions, which are briefly summarized in section 3.1. The aim of sections 
3.2 and 3.3 is to provide a thorough presentation of the concepts of basis and 
dimension, especially for infinite-dimensional vector spaces, as these are topics 
that are not normally developed rigorously in the undergraduate curriculum. The 
approach is unified in the sense that we do not treat finite and infinite-dimensional 
spaces separately. Important concepts make their debut in section 3.4. These 
include algebraic complements, quotient spaces, direct sums, projections, linear 
functionals, and invariant subspaces. Section 3.5 provides a brief refresher of 
matrix representations and diagonalization. Section 3.6 introduces normed linear 
spaces and is followed by an extensive study of inner product spaces in section 
3.7. The presentation of inner product spaces in this section and in section 4.10 
is not limited to finite-dimensional spaces; rather, it covers many of the properties of 
inner products that do not require completeness. The chapter concludes with the 
finite-dimensional spectral theory. 


Part II. Topology 


A respectable one-semester course on topology can be based on chapters 4 and 5. 
It is my belief that an adequate mastery of the basics of topology is a necessary 
prerequisite for an organized study of higher mathematics. This is a focal point of 
the book philosophy. It is fair to say that the book, generally speaking, has a mild 
topological flavor. Chapters 4 and 5 provide a solid launch pad into the last three 
chapters of the book. It is possible for the instructor, with a moderate amount of 
maneuvering, to navigate most of the rest of the book while avoiding chapter 5. This 
chapter, however, contributes richly to the depth of the book. 


Chapter 4. This chapter provides an extensive account of the metric topology 
and is a prerequisite for all the subsequent chapters. The leading sections furnish 
basic concepts such as closure, continuity, separation properties, product spaces, 
and countability axioms. This is followed by a detailed study of completeness, 
compactness, and function spaces. Chapter applications include contraction 
mappings, nowhere differentiable functions, and space-filling curves. The chapter 
concludes with a detailed section on Fourier series and orthogonal polynomials, 
which, together with section 3.7, provides an excellent background for Hilbert 
spaces. Our study of sequence and function spaces in this chapter leads up gently 
into Banach spaces. 


Chapter 5. This chapter emphasizes the nonmetric properties of topology. 
Sections 5.1-5.8 constitute the core of the chapter. Section 5.5 is terminal and 
may be omitted. The remaining sections are more advanced and can be omitted. 
Section 5.9 (locally compact spaces) is the transitional section between the core 
of the chapter and the last three sections. At various points in the book, I point 
out how results stated for the metric case can be extended to topological spaces, 
especially locally compact spaces. Some such results are developed in the exercises. 
Sections 5.10-5.12 are optional, and little subsequent material is based on them. I 
provided a specialized proof of Urysohn’s lemma for $\mathbb{R}^n$ in section 8.4 in order to 
help instructors avoid section 5.11, if they so choose. Tychonoff’s theorem appears 
twice: once in section 5.8, for the product of finitely many topological spaces, 
and again in section 5.12, for the product of infinitely many spaces. The proofs 
are different, and both are worthy of inclusion, if an instructor decides to include 
section 5.12. 


Part III. Functional Analysis 


An introductory course on functional analysis can be based on the instructor's 
choice of the background material and chapters 4, 6, and 7. 


Chapter 6. This chapter introduces Banach spaces. Sections 6.1-6.4 form the 
core of the chapter. It would be accurate to characterize sections 6.1-6.4 as quite 
classical. Section 6.5 is needed for sections 7.3 and 7.4. Section 6.6 can be omitted 
if a brief introduction is the goal. In this case, section 7.5 must also be omitted. 
Section 6.7 is terminal and may be omitted without consequence. I have enriched 
the chapter by including such topics as Gelfand’s theorem, Schauder bases, and 
complemented subspaces. Chapters 6 and 7 include a good number of applications 
of the four fundamental theorems of functional analysis. 


Chapter 7. This chapter introduces Hilbert spaces and the elements of operator 
theory. Sections 7.3 and 7.4 contain a good set of results on self-adjoint and 
compact operators. The section exercises contain problems that suggest alternative 
approaches and hence allow the instructor to shorten these two sections while 
preserving good depth. For example, the Fredholm theory can be bypassed if 
the instructor wishes to limit the discussion to compact, self-adjoint operators on 
Hilbert spaces. Sections 7.3 and 7.4 are written in such a way as to facilitate extending 
the results to compact operators on Banach spaces (section 7.5). For example, we 
used Riesz’s lemma instead of the projection theorem in order to keep the proofs 
adaptable for extension to Banach spaces. Sections 7.3-7.5 contain more results 
than are typically found in an introductory course. 


Part IV. Integration Theory 


Together with chapter 4, this chapter constitutes the general/real analysis 
component of the book, and a good course on real analysis can be built on the 
background material and those two chapters. 


Chapter 8. Section 8.1 furnishes a brief but rigorous introduction to the Riemann 
integral of continuous functions on compact boxes in $\mathbb{R}^n$. Although it has intrinsic 
value, the section is included for the express purpose of developing section 8.4. 
Section 8.4 develops the Lebesgue measure on $\mathbb{R}^n$, and the approach is to extend 
the positive linear functional provided by the Riemann integral on the space 
of continuous, compactly supported functions on $\mathbb{R}^n$. This very nearly amounts 
to developing the Radon measure theory on locally compact Hausdorff spaces. 
However, I chose to limit the discussion to Lebesgue measure on $\mathbb{R}^n$ because I did 
not wish to base the presentation heavily on chapter 5. I did, nonetheless, include 
an excursion into Radon measures as an optional topic. The rest of the chapter is 
largely independent of sections 8.1 and 8.4 and constitutes a decent introduction 
to general measure and integration theories. The section on complex measures has 
intrinsic value but is also included in order to facilitate the study of the duals of $L^p$ 
spaces. In particular, I limited the discussion of signed measures to real measures, 
that is, signed measures that are not allowed to assume infinite values. This turned 
out to be sufficient for our purposes. The selection of topics and the approach in 
sections 8.6 and 8.8 are quite classical and cover the basics of $L^p$ spaces and product 
measures. Section 8.7 contains an excellent collection of approximation theorems, 
including approximations by $C^\infty$ functions. The title of the last section accurately 
captures its contents: a mere glimpse of the subject. However, the section finally 
settles questions started in sections 3.7 and 4.10 and concludes with the unraveling 
of the mystery about the completeness of orthogonal polynomials. 


Appendices 


Appendix A. This appendix contains the proof of the equivalence of the axiom 
of choice, Zorn’s lemma, and the well-ordering principle. I created this appendix 
in order to avoid distraction if instructors decide not to include the proof in their 
course. 


Appendix B. This appendix is rather elementary in nature. It develops matrix 
factorizations and is used for deriving the change of variables formula in the 
exercises on section 8.8. Reference to this appendix is also made in section 3.5. 


Preliminaries 


We are justified in calling numbers a free creation of the human mind. 
Richard Dedekind 


Richard Dedekind. 1831-1916 
In 1848, at the age of 16, Dedekind entered The Collegium Carolinum, an 
educational institution between a high school and a university. He then attended 
the University of Göttingen in 1850, and in 1852 completed his doctoral work in 
four semesters under Gauss’s supervision. Dedekind was to be Gauss’s last pupil. 
Dedekind spent the following two years in Berlin for further training, returning 
to Göttingen in 1855, the year Gauss died. 


Dirichlet was appointed to fill Gauss’s chair at Göttingen, soon became Dedekind’s 
friend and mentor, and had a strong influence in shaping his mathematical 
interests. While at Göttingen, Dedekind studied the work of Galois and was the 
first to lecture on Galois theory. 


Dedekind was later appointed to the Polytechnic of Zürich and began teaching 
there in 1858. By the 1860s, The Collegium Carolinum in Brunswick had been 
upgraded to the Brunswick Polytechnic, and Dedekind was appointed to it in 
1862. With this appointment he returned to his hometown and remained there 
for the rest of his life. 


Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules. 
DOI: 10.1093/oso/9780198868781.003.0001 


Dedekind made a number of highly significant contributions to mathematics, 
including his definition of finite and infinite sets, and his construction of the real 
numbers as cuts in the set of rational numbers. Dedekind’s definitions are accepted 
today as the standard definitions. 


Among Dedekind’s other notable contributions to mathematics were his editions 
of the collected works of Dirichlet, Gauss, and Riemann. His study of Dirichlet’s 
work led him to study algebraic number fields, where he realized the importance 
of rings and ideals. The general term ring did not appear in Dedekind’s work; it was 
introduced later by David Hilbert, and Dedekind’s notion of an ideal was taken up 
and extended by Hilbert and then later by Emmy Noether. 


Dedekind retired in 1894. His life was long, healthy, and contented. He never 
married and instead lived with one of his sisters, who also remained unmarried, 
for most of his adult life. “He did not feel pressed to have a more marked effect in 
the outside world: such confirmation of himself was unnecessary.”* 


“Dedekind’s legacy ... consisted not only of important theorems, examples, and 
concepts, but a whole style of mathematics that has been an inspiration to each 
succeeding generation.”† 


1.1 Sets, Functions, and Relations 


The reader is expected to be familiar with basic set theoretic concepts such 
as containment, unions, and intersections and should be comfortable with set 
notation. Most of the essential definitions will be stated in this section. A number 
of basic facts will be stated as theorems, without proof. 


We use the symbols $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ to denote, respectively, the natural numbers, 
the integers, rational numbers, real numbers, and complex numbers. The symbol 
$\varnothing$ denotes the empty set. 


* J. J. O'Connor and E. F. Robertson, “Julius Wilhelm Richard Dedekind,” in MacTutor History of 
Mathematics (St Andrews: University of St Andrews, 1998), 
http://mathshistory.st-andrews.ac.uk/Biographies/Dedekind/, accessed Oct. 31, 2020. 

† O'Connor and Robertson, “Julius Wilhelm Richard Dedekind.” 


Notation. If $X$ is a suitable universal set for a particular problem and $A \subseteq X$, we 
use the notation $X - A$ to denote the complement of $A$ in $X$: 

$$X - A = \{x \in X : x \notin A\}.$$

We use the same notation for relative differences (the complement of $B$ in $A$): 

$$A - B = \{x \in A : x \notin B\} = A \cap (X - B).$$


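Theorem 1.1.1 (distributive laws). Let $A$ and $B_1, B_2, \dots, B_n$ be subsets of a set $X$. 
Then 


(a) $A \cup \left(\bigcap_{i=1}^{n} B_i\right) = \bigcap_{i=1}^{n} (A \cup B_i)$, 
(b) $A \cap \left(\bigcup_{i=1}^{n} B_i\right) = \bigcup_{i=1}^{n} (A \cap B_i)$. 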


Theorem 1.1.2 (De Morgan’s laws). Let $A_1, A_2, \dots, A_n$ be subsets of a set $X$. Then 


(a) $X - \bigcup_{i=1}^{n} A_i = \bigcap_{i=1}^{n} (X - A_i)$, 
(b) $X - \bigcap_{i=1}^{n} A_i = \bigcup_{i=1}^{n} (X - A_i)$. 
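

The distributive and De Morgan laws can be spot-checked on small finite sets. The following is only a sanity check under assumed data (the universal set `X`, the set `A`, and the family `Bs` are made up for illustration), not a proof:

```python
from functools import reduce

X = set(range(10))                       # hypothetical universal set
A = {0, 1, 2, 3}                         # hypothetical subset of X
Bs = [{1, 2, 5}, {2, 3, 5, 7}, {2, 8}]   # hypothetical family B_1, ..., B_n

def union_all(sets):
    """Union of a finite family of sets."""
    return reduce(set.union, sets, set())

def intersect_all(sets, universe):
    """Intersection of a finite family of subsets of `universe`."""
    return reduce(set.intersection, sets, set(universe))

# Distributive laws (a) and (b).
dist_a = (A | intersect_all(Bs, X)) == intersect_all([A | B for B in Bs], X)
dist_b = (A & union_all(Bs)) == union_all([A & B for B in Bs])

# De Morgan's laws (a) and (b), applied to the same family.
dm_a = (X - union_all(Bs)) == intersect_all([X - B for B in Bs], X)
dm_b = (X - intersect_all(Bs, X)) == union_all([X - B for B in Bs])

print(dist_a, dist_b, dm_a, dm_b)
```

Each of the four flags compares the left and right sides of one identity for the chosen sets.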


Definition. If $x$ and $y$ are objects (e.g., numbers, functions, sets), the ordered pair 
$(x,y)$ is defined by $(x,y) = \{x, \{x, y\}\}$. The reader can verify that the definition 
guarantees that $(x,y) = (a,b)$ if and only if $x = a$ and $y = b$. 


Definition. Let $X$ and $Y$ be nonempty sets. The Cartesian product of $X$ and $Y$ is 
the set of all ordered pairs: 

$$X \times Y = \{(x,y) : x \in X,\ y \in Y\}.$$


Definitions. Let $X$ and $Y$ be nonempty sets. A function $f$ from $X$ to $Y$ is a subset 
of $X \times Y$ such that for any $x \in X$, there is a unique $y \in Y$ such that $(x,y) \in f$. We 
use the more common notation $y = f(x)$ instead of the cumbersome $(x,y) \in f$. 
We use the notation $f : X \to Y$ to indicate that $f$ is a function from $X$ to $Y$; $X$ is 
called the domain of $f$, denoted $\mathrm{Dom}(f)$, and the range of $f$, denoted $R(f)$, is 
the set of all function values, 

$$R(f) = \{f(x) : x \in X\}.$$

If $A \subseteq X$, the image of $A$ under $f$ is the set $f(A) = \{f(a) : a \in A\}$. The inverse 
image of a set $B \subseteq Y$ is the set $f^{-1}(B) = \{x \in X : f(x) \in B\}$. 
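

For a function on a finite domain, the range, images, and inverse images can be computed directly from the definitions. A minimal sketch, assuming a made-up domain `X` and the squaring function (neither is from the text):

```python
def image(f, A):
    """f(A) = {f(a) : a in A}."""
    return {f(a) for a in A}

def inverse_image(f, X, B):
    """f^{-1}(B) = {x in X : f(x) in B}; the domain X is passed explicitly."""
    return {x for x in X if f(x) in B}

X = {-2, -1, 0, 1, 2}    # hypothetical finite domain

def f(x):
    return x * x         # a function that is not one-to-one on X

range_f = image(f, X)                     # the range R(f)
img = image(f, {1, 2})                    # image of A = {1, 2}
pre = inverse_image(f, X, {1, 4, 9})      # inverse image of B = {1, 4, 9}
back = inverse_image(f, X, img)           # f^{-1}(f(A)) for A = {1, 2}

print(range_f, img, pre, back)
```

Here the inverse image of f(A) is strictly larger than A = {1, 2}, which is possible only because this particular f is not one-to-one.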


Definitions. A function $f : X \to Y$ is onto (or surjective) if $R(f) = Y$. Thus, 
every $y \in Y$ is the image of some $x \in X$. A function $f$ is called one-to-one (or 
injective) if, for $x_1, x_2 \in X$, $x_1 \neq x_2$ implies $f(x_1) \neq f(x_2)$. Finally, $f$ is called a 
one-to-one correspondence (or a bijection) if $f$ is one-to-one and onto. 


Definition. The identity function $I_X$ on a set $X$ is the function $I_X(x) = x$ for all 
$x \in X$. 


Definition. Let $f : X \to Y$ and $g : Y \to Z$. The composition of $f$ and $g$ is the 
function $g \circ f : X \to Z$ defined by $(g \circ f)(x) = g(f(x))$. 
We sometimes use the notation $gf$ if there is no danger of confusing the 
composition of $f$ and $g$ with the product of $f$ and $g$. 


Definition. Let $f : X \to Y$ and $g : Y \to X$ be functions. We say that $g$ is the inverse 
of $f$ if $g \circ f = I_X$ and $f \circ g = I_Y$. 
We write $g = f^{-1}$ to indicate that $g$ is the inverse of $f$. Notice that a function $f$ 
has an inverse if and only if it is bijective. Also, if $g = f^{-1}$, then $f = g^{-1}$. 


Definition. A finite sequence in a set $A$ is a function $a : \{1, 2, \dots, n\} \to A$. The 
element $a(i)$ is often denoted by $a_i$. It is sometimes the case that a distinction 
must be made between the sequence (as a function) and its range (as a set). We 
denote a sequence by the notation $(a_i)_{i=1}^{n}$, and its range by $\{a_1, a_2, \dots, a_n\}$. An 
infinite sequence in $A$ is a function $a : \mathbb{N} \to A$. An infinite sequence is often 
given the notation $(a_n)$. 


Indexed Sets 


Let $I$ be a set (the indexing set) and let $\mathfrak{A}$ be a collection of sets. An indexing of 
$\mathfrak{A}$ by $I$ is a bijection $A : I \to \mathfrak{A}$. The image of an element $\alpha \in I$ is denoted by $A_\alpha$ 
instead of $A(\alpha)$. Thus $\mathfrak{A} = \{A_\alpha : \alpha \in I\}$. Indexing is, of course, not limited to sets; 
one can index, for example, a set of numbers, or functions. If there is no danger 
of ambiguity, we sometimes omit reference to the indexing set $I$ and write $\{A_\alpha\}_\alpha$. 
Indexing is clearly a generalization of sequencing, as illustrated by the examples 
below. 


Example 1. For each $n \in \mathbb{Z}$, let $A_n = (n, n+1)$. This is a collection of intervals 
indexed by $\mathbb{Z}$. 


Example 2. Let $I = \mathbb{R}$ and, for $\alpha \in I$, let $B_\alpha = (\alpha, \infty)$. 


Example 3. We can index the set of linear homogeneous functions in one real 
variable as $\{f_a : a \in \mathbb{R}\}$, where, for $x \in \mathbb{R}$, $f_a(x) = ax$. 


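Definition. Let $\mathfrak{A} = \{A_\alpha : \alpha \in I\}$ be an indexed family of subsets of a set $X$. We 
define the union and intersection of $\mathfrak{A}$ as follows: 

$$\bigcup_{\alpha \in I} A_\alpha = \{x \in X : x \in A_\alpha \text{ for some } \alpha \in I\},$$
$$\bigcap_{\alpha \in I} A_\alpha = \{x \in X : x \in A_\alpha \text{ for all } \alpha \in I\}.$$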


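Example 4. In example 1, $\bigcup_{n \in \mathbb{Z}} A_n = \mathbb{R} - \mathbb{Z}$ and $\bigcap_{n \in \mathbb{Z}} A_n = \varnothing$. 
Example 5. In example 2, $\bigcup_{\alpha \in \mathbb{R}} B_\alpha = \mathbb{R}$ and $\bigcap_{\alpha \in \mathbb{R}} B_\alpha = \varnothing$. 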


Definition. A family of sets $\{A_\alpha\}_\alpha$ is said to be disjoint if $A_\alpha \cap A_\beta = \varnothing$ whenever 
$\alpha \neq \beta$. 


For example, the family $\{A_n\}$ in example 1 is a disjoint family. 
The following theorem will be used frequently in this book. 


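Theorem 1.1.3. Let $(A_n)$ be a sequence of subsets of a given set $X$. Then 
(a) There exists a sequence of sets $(B_n)$ such that $B_1 \subseteq B_2 \subseteq \cdots$ and $\bigcup_{n=1}^{\infty} A_n = 
\bigcup_{n=1}^{\infty} B_n$. We simply define $B_n = \bigcup_{i=1}^{n} A_i$. 
(b) There exists a disjoint sequence of sets $(C_n)$ such that $\bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} C_n$. The 
sequence we seek is $C_1 = A_1$ and, for $n \geq 2$, $C_n = A_n - \bigcup_{i=1}^{n-1} A_i$. 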
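

The two constructions in theorem 1.1.3 are easy to carry out for a finite list of sets. A minimal sketch, assuming a made-up list `A` standing in for the first few terms of the sequence (the function names are hypothetical):

```python
def ascending(A):
    """B_n = A_1 ∪ ... ∪ A_n for each n."""
    out, acc = [], set()
    for S in A:
        acc = acc | S
        out.append(acc)
    return out

def disjointify(A):
    """C_1 = A_1 and C_n = A_n - (A_1 ∪ ... ∪ A_{n-1}) for n >= 2."""
    out, seen = [], set()
    for S in A:
        out.append(S - seen)
        seen = seen | S
    return out

A = [{1, 2}, {2, 3}, {1, 4}, {3, 5}]   # hypothetical sets A_1, ..., A_4
B = ascending(A)
C = disjointify(A)

print(B)
print(C)
```

Both new sequences have the same union as the original one; the sets in `B` are nested, and the sets in `C` are pairwise disjoint.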


Definition. A sequence $(B_n)$ of sets is said to be ascending if $B_1 \subseteq B_2 \subseteq \cdots$. 
A sequence $(B_n)$ of sets is said to be descending if $B_1 \supseteq B_2 \supseteq \cdots$. 


The following two theorems generalize theorems 1.1.1 and 1.1.2. 


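Theorem 1.1.4 (distributive laws). Let $\{B_\alpha\}_\alpha$ be an indexed family of subsets of a 
set $X$, and let $A$ be a subset of $X$. Then 


(a) $A \cup \left(\bigcap_\alpha B_\alpha\right) = \bigcap_\alpha (A \cup B_\alpha)$, 
(b) $A \cap \left(\bigcup_\alpha B_\alpha\right) = \bigcup_\alpha (A \cap B_\alpha)$. 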


Theorem 1.1.5 (De Morgan’s laws). Let $\{A_\alpha\}_\alpha$ be an indexed family of subsets of a 
set $X$. Then 


(a) $X - \bigcap_\alpha A_\alpha = \bigcup_\alpha (X - A_\alpha)$, 
(b) $X - \bigcup_\alpha A_\alpha = \bigcap_\alpha (X - A_\alpha)$. 


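Theorem 1.1.6. Let $f : X \to Y$, let $\{A_\alpha\}_\alpha$ be a collection of subsets of $X$, and let $\{B_\beta\}_\beta$ 
be a collection of subsets of $Y$. Then 


(a) $f\left(\bigcup_\alpha A_\alpha\right) = \bigcup_\alpha f(A_\alpha)$, 
(b) $f\left(\bigcap_\alpha A_\alpha\right) \subseteq \bigcap_\alpha f(A_\alpha)$, 
(c) $f^{-1}\left(\bigcup_\beta B_\beta\right) = \bigcup_\beta f^{-1}(B_\beta)$, 
(d) $f^{-1}\left(\bigcap_\beta B_\beta\right) = \bigcap_\beta f^{-1}(B_\beta)$. 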
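

The inclusion in part (b) can be strict when $f$ is not one-to-one. A small illustration, with a function and sets chosen here purely for the example (they do not appear in the text):

```latex
\[
f : \mathbb{R} \to \mathbb{R}, \quad f(x) = x^2, \qquad A_1 = \{-1\}, \quad A_2 = \{1\}.
\]
% The two sets are disjoint, so the image of their intersection is empty,
% while the two images coincide.
\[
f(A_1 \cap A_2) = f(\varnothing) = \varnothing,
\qquad
f(A_1) \cap f(A_2) = \{1\} \cap \{1\} = \{1\}.
\]
```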


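Definition (Cartesian products). Let $\{X_\alpha\}_{\alpha \in I}$ be a nonempty collection of 
nonempty sets. The product $\prod_{\alpha \in I} X_\alpha$ is the collection of all functions 

$$x : I \to \bigcup_{\alpha \in I} X_\alpha$$

such that $x(\alpha) \in X_\alpha$ for all $\alpha \in I$. We write $x_\alpha$ for $x(\alpha)$. 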


We will denote the function $x$ in the above definition by $(x_\alpha)_{\alpha \in I}$, or simply $(x_\alpha)$. 
The above definition generalizes the definition of the Cartesian product of a 
finite number of sets. Indeed, for sets $X_1, X_2, \dots, X_n$, the Cartesian product $\prod_{i=1}^{n} X_i$ 
is the set of all sequences $(x_1, x_2, \dots, x_n)$ such that $x_i \in X_i$ for all $1 \leq i \leq n$. A 
sequence is nothing but a function $x : \{1, 2, \dots, n\} \to \bigcup_{i=1}^{n} X_i$ such that $x_i = x(i) \in 
X_i$ for all $1 \leq i \leq n$. 


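Example 6. $\mathbb{R}^n = \mathbb{R} \times \cdots \times \mathbb{R}$ ($n$ factors) is the Euclidean $n$-space. The complex 
$n$-space $\mathbb{C}^n$ is defined similarly. 


Example 7. Let $\mathbb{R}^{\mathbb{N}}$ be the set of all infinite sequences in $\mathbb{R}$. This is also the product 
$\prod_{i=1}^{\infty} X_i$, where each $X_i = \mathbb{R}$. 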


Example 8. Let $A$ be a set, and let $2^A$ denote the set of all functions from $A$ to 
the set $\{0,1\}$. Indeed, $2^A$ is a product because if we define $X_a = \{0,1\}$ for all 
$a \in A$, then $2^A = \prod_{a \in A} X_a$. As a special case, the set $2^{\mathbb{N}}$ is the set of all binary 
sequences. 


Definition (set exponentiation). Let $A$ and $B$ be nonempty sets. Define $A^B$ to be 
the set of all functions $f : B \to A$. We leave it to the reader to interpret $A^B$ as a 
product. 




Definition. Let A be a nonempty set. The collection of all the subsets of A, 
including the empty set, is known as the power set of A and is denoted 
by P(A). 


Definition. For a subset $S$ of a set $A$, we define the characteristic function of $S$ by 

$$\chi_S(x) = \begin{cases} 1 & \text{if } x \in S, \\ 0 & \text{if } x \notin S. \end{cases}$$


Clearly, $\chi_S \in 2^A$. Moreover, the correspondence $\chi : P(A) \to 2^A$ that assigns to 
each element $S \in P(A)$ (i.e., $S \subseteq A$) its characteristic function $\chi_S$ is a bijection 
from $P(A)$ to $2^A$. We leave it to the reader to verify the details. 
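

For a small finite set, the correspondence between subsets and characteristic functions can be enumerated explicitly. A minimal sketch, assuming a made-up three-element set `A` listed in a fixed order, with each characteristic function stored as a tuple of 0/1 values (one entry per element of `A`):

```python
from itertools import combinations, product

A = ['a', 'b', 'c']   # hypothetical finite set, in a fixed order

def power_set(A):
    """All subsets of A, including the empty set."""
    return [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

def chi(S, A):
    """Characteristic function of S, recorded as its tuple of values on A."""
    return tuple(1 if x in S else 0 for x in A)

subsets = power_set(A)
functions = {chi(S, A) for S in subsets}            # images under S -> chi_S
all_binary = set(product((0, 1), repeat=len(A)))    # every function A -> {0, 1}

print(len(subsets), len(functions), functions == all_binary)
```

The counts agree and every 0/1-valued function is hit, which is what a bijection between the power set and $2^A$ looks like in this finite case.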


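Definition. Let $\{X_\alpha\}_\alpha$ be a collection of sets, and let $X = \prod_{\alpha \in I} X_\alpha$. For each 
$\alpha \in I$, define the projection $\pi_\alpha : X \to X_\alpha$ by $\pi_\alpha(x) = x_\alpha$. Here $x = (x_\alpha)_{\alpha \in I}$ is 
an element of $X$. 


Example 9. Let $X = \mathbb{R}^n$. Then $\pi_1 : \mathbb{R}^n \to \mathbb{R}$ is indeed what we think of as the 
projection of $\mathbb{R}^n$ onto the $x_1$-axis: $\pi_1(x_1, \dots, x_n) = x_1$. 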
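

The view of a product as a set of functions on the index set, together with the projections, can be made concrete for a finite family. A minimal sketch, assuming a made-up family `X` indexed by {1, 2, 3}; each element of the product is stored as a dict, and the names `proj` and `proj_preimage` are hypothetical:

```python
from itertools import product

# Hypothetical finite family {X_i} indexed by I = {1, 2, 3}.
X = {1: ['a', 'b'], 2: [0, 1], 3: ['x']}
I = sorted(X)

# Each element of the product is a function x on I with x(i) in X_i,
# represented here as a dict from indices to values.
elements = [dict(zip(I, choice)) for choice in product(*(X[i] for i in I))]

def proj(i, x):
    """The projection pi_i(x) = x_i."""
    return x[i]

def proj_preimage(i, U):
    """pi_i^{-1}(U): the elements of the product whose i-th coordinate lies in U."""
    return [x for x in elements if proj(i, x) in U]

print(len(elements))
print(proj_preimage(2, {0}))
```

The inverse image of a set under a projection constrains one coordinate only and leaves the others free.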


Example 10. Consider the set $D$ of all functions $f : [0,1] \to \mathbb{R}$. This set can be 
thought of as $\prod_{a \in [0,1]} X_a$, where each $X_a = \mathbb{R}$. If $f \in D$ and $a \in [0,1]$, then 
$\pi_a(f) = f(a)$. Fix an element $a \in [0,1]$ and an interval $U \subseteq \mathbb{R}$. It makes sense 
to ask what $\pi_a^{-1}(U)$ is. This is simply the set of all functions $f \in D$ such that 
$\pi_a(f) \in U$ or simply $f(a) \in U$. Thus $\pi_a^{-1}(U)$ is the set of all the functions on 
the closed unit interval whose graphs cross the line segment $\{a\} \times U$. 


Definition. A relation $R$ on a set $A$ is a subset of $A \times A$. Thus $R$ is a set of ordered 
pairs $(x,y)$, where $x, y \in A$. Instead of writing $(x,y) \in R$, we write $xRy$. If $xRy$, 
we say that $x$ is related to $y$. 


Definition. A relation R on a set A is said to be 


(a) reflexive if, for all $x \in A$, $xRx$; 

(b) symmetric if yRx whenever xRy; 

(c) transitive if xRy and yRz imply xRz; and 

(d) an equivalence relation if it is reflexive, symmetric, and transitive. 


Definition. Let $R$ be an equivalence relation on a set $A$, and let $x \in A$. The 
equivalence class of $x$, denoted $[x]$, is $[x] = \{y \in A : yRx\}$. 


Theorem 1.1.7. Let $R$ be an equivalence relation on a set $A$, and let $x, y \in A$. Then 


(a) $[x] = [y]$ if and only if $xRy$. 
(b) If $[x] \neq [y]$, then $[x] \cap [y] = \varnothing$. 


Thus the union of the equivalence classes is A, and distinct equivalence classes are 
disjoint. The common terminology is that the equivalence classes partition A. 
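

The partition into equivalence classes can be computed for a finite set. A minimal sketch, assuming a made-up relation (two integers are related when they have the same absolute value) and a hypothetical helper `equivalence_classes`; the helper relies on the relation being a genuine equivalence relation, so comparing with one representative per class suffices:

```python
def equivalence_classes(A, related):
    """Group the elements of A into classes of the equivalence relation `related`."""
    classes = []
    for x in A:
        for cls in classes:
            representative = next(iter(cls))
            if related(x, representative):
                cls.add(x)
                break
        else:
            # x is related to no existing representative: it starts a new class.
            classes.append({x})
    return classes

A = range(-3, 4)   # hypothetical finite set {-3, ..., 3}
classes = equivalence_classes(A, lambda x, y: abs(x) == abs(y))

print(classes)
```

The classes are pairwise disjoint and their union is all of `A`.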


Exercises 


In the exercises below, $A$, $A_n$, $B$, $C$, and so on are subsets of a nonempty set $X$. 


1. Prove that $A \cap (B - C) = (A \cap B) - (A \cap C)$. 
Is it true that $A \cup (B - C) = (A \cup B) - (A \cup C)$? 


2. Find Unenl= 57], Naeto,yla—1,¢+1], and Ugeo,yla—1a+ 1]. 
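3. (a) Show that $A \subseteq B$ if and only if $P(A) \subseteq P(B)$. 
(b) Show that $P(A) \cup P(B) \subseteq P(A \cup B)$. 
(c) Show that $P(A) \cap P(B) = P(A \cap B)$. 

4. For $r \in \mathbb{R}$, $r > 0$, let $B_r = \{(x,y) \in \mathbb{R}^2 : x^2 + y^2 < r\}$. Find $\bigcap_{r>0} B_r$ and 
$\bigcup_{r>0} B_r$. 

5. Describe the following sets in words: $\bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k$ and $\bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k$. 

6. Let $\{A_\alpha\}_\alpha$ be an indexed family of sets, and let $B$ be a set. Show that 
(a) $\left(\bigcup_\alpha A_\alpha\right) \times B = \bigcup_\alpha (A_\alpha \times B)$, and 
(b) $\left(\bigcap_\alpha A_\alpha\right) \times B = \bigcap_\alpha (A_\alpha \times B)$. 

7. Let $A = \{x \in \mathbb{R} : |x| < 1\}$, $B = \{x \in \mathbb{R} : |x| > 1\}$. Give a geometric 
interpretation of $A \times B$. 

8. Consider the product $\prod_{\alpha \in I} X_\alpha$ of a collection of sets. Suppose that 
$\{\alpha_1, \alpha_2, \dots, \alpha_n\}$ is a finite subset of $I$ and that $U_{\alpha_i} \subseteq X_{\alpha_i}$, $1 \leq i \leq n$. 
Describe the set $\bigcap_{i=1}^{n} \pi_{\alpha_i}^{-1}(U_{\alpha_i})$. 

9. Prove theorem 1.1.6. 

10. Let $f : X \to Y$. Show that the following are equivalent: 
(a) $f$ is one-to-one. 
(b) $f(A_1 \cap A_2) = f(A_1) \cap f(A_2)$ for every $A_1, A_2 \subseteq X$. 
(c) $f^{-1}(f(A)) = A$ for every $A \subseteq X$. 
(d) $f(X - A) \subseteq Y - f(A)$ for every $A \subseteq X$. 

11. Let $f : X \to Y$. Show that the following are equivalent: 
(a) $f$ is onto. 
(b) $f(f^{-1}(B)) = B$ for every $B \subseteq Y$. 
(c) $f(X - A) \supseteq Y - f(A)$ for every $A \subseteq X$. 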

12. Show that the composition of two injective (respectively, surjective, bijective) 
functions is injective (respectively, surjective, bijective). 

13. Let $f : A \to B$ and $g : B \to C$ be bijections. Show that $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$. 

14. (a) Show that if $f : A \to B$ is injective, then there exists a function $g : B \to A$ 
such that $g \circ f = I_A$. 

(b) Show that if $f : A \to B$ is surjective, then there exists a function $g : B \to A$ 
such that $f \circ g = I_B$. 

15. Show that the function $f : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ given by $f(m,n) = 2^{m-1}(2n - 1)$ is a 
one-to-one correspondence. 

16. Verify the one-to-one correspondence between $2^A$ and $P(A)$. 

17. Show that if $A$ has $n$ elements and $B$ has $m$ elements, then $A^B$ has $n^m$ 
elements. Conclude that $P(A)$ has $2^n$ elements. 

18. Let $S$ and $T$ be subsets of a set $A$. Show that 
(a) $\chi_{S \cap T} = \chi_S \chi_T$, and 
(b) $\chi_{S \cup T} = \chi_S + \chi_T - \chi_{S \cap T}$. 

19. Prove theorem 1.1.7. 

20. Fix an integer $n > 1$, and define a relation $R$ on $\mathbb{Z}$ as follows: $xRy$ if $x - y$ 
is a multiple of $n$. Show that $R$ is an equivalence relation, and describe the 
equivalence classes. 

21. Define a relation $R$ on $\mathbb{R}$ as follows: $xRy$ if and only if $x - y \in \mathbb{Q}$. Show that 
$R$ is an equivalence relation. 

22. Define a relation $R$ on $\mathbb{Z}$ by $xRy$ if and only if $x^2 + y^2$ is even. Show that $R$ 
is an equivalence relation. 

23. Define a relation $R$ on $\mathbb{R}$ by $xRy$ if and only if $xy > 0$. Is $R$ an equivalence 
relation? 


1.2 The Real and Complex Number Fields 


An organized study of mathematics must be rooted in a proper understanding of 
number systems. Authors of textbooks such as this one are often divided between 
two extremes: either they provide an extensive development of number systems 
from scratch or they ignore the entire matter and consider knowledge of the 
real numbers to be a prerequisite. This presentation is a compromise between 
the two extremes. It is assumed that the reader has a thorough knowledge of 
integers and the rational number field, including such topics as divisibility, prime 
factorizations, the infinitude of the set of prime numbers, and the construction 
of Q in the usual manner as the quotient field of Z. We basically accept the 
completeness of real numbers as an axiom, then prove the Cauchy criterion and 
the Bolzano-Weierstrass property, which we decided to develop at length since it 
is a cornerstone theorem. The section concludes with the definition of complex 
numbers and a study of their basic properties, including completeness. Although 
the section is not totally self-contained, there is value in its inclusion because 
it illustrates a number of important proof techniques and provides a succinct 
summary of the properties of real and complex number fields. 


Definition. Let $F$ be a nonempty set endowed with two binary operations, $+$ 
(addition) and $\times$ (multiplication). The triple $(F, +, \times)$ is said to be a field if the 
following conditions are satisfied for all $a, b, c \in F$: 


(a) $a + b = b + a$. 
(b) $a + (b + c) = (a + b) + c$. 
(c) There is an element $0 \in F$ such that $a + 0 = a$. 
(d) For every $a \in F$, there is an element $-a \in F$ such that $a + (-a) = 0$. 
(e) $a \times b = b \times a$. 
(f) $a \times (b \times c) = (a \times b) \times c$. 
(g) There is an element $1 \in F$ such that $a \times 1 = a$. 
(h) For every $a \neq 0$, there is an element $a^{-1}$ such that $a \times a^{-1} = 1$. 
(i) $a \times (b + c) = a \times b + a \times c$. 


We often omit the symbol for multiplication and write $ab$ or $a \cdot b$ for $a \times b$. The 
element $0$ is called the additive identity, and $1$ is called the multiplicative identity 
of the field. A field must clearly contain at least two elements. 


With the usual operations of addition and multiplication of numbers, the rational 
numbers, $\mathbb{Q}$, and the real numbers, $\mathbb{R}$, are fields. We will see later in this section 
that complex numbers also form a field. 


Example 1. Let p be a prime number. Define an equivalence relation ≡ on Z by 
a ≡ b if a − b is divisible by p. Since the remainder upon dividing an integer 
by p is an integer between 0 and p − 1, the equivalence classes 
containing 0, 1, ..., p − 1 are all the equivalence classes of ≡. The field of p 
elements (also called the integers modulo p) consists of the equivalence classes 
of 0, 1, ..., p − 1 and is often given the symbol Z_p. The equivalence class of an 
integer n is denoted [n]. Addition and multiplication in Z_p are defined as follows: 
[n] + [m] = [n + m], and [n]·[m] = [nm]. This simply means that we add or multiply 
integers representing the classes, then reduce the result modulo p. For example, 
let p = 7. Then [6] + [5] = [11] = [4]. ♦ 
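

To make the arithmetic of the integers modulo p concrete, here is a minimal Python sketch. It stores each class by its representative in {0, ..., p − 1}. The names P, add, mul, and inverse are our own illustrative choices, and the inverse is found by exhaustive search, which is an assumption of convenience and not a method taken from the text.

```python
# Arithmetic modulo a prime p, with each class stored as its representative in 0..p-1.
P = 7  # any prime would do; 7 matches the example above


def add(n, m, p=P):
    """Add two classes and reduce modulo p."""
    return (n + m) % p


def mul(n, m, p=P):
    """Multiply two classes and reduce modulo p."""
    return (n * m) % p


def inverse(n, p=P):
    """Multiplicative inverse of a nonzero class, found by exhaustive search."""
    if n % p == 0:
        raise ZeroDivisionError("the class of 0 has no multiplicative inverse")
    for m in range(1, p):
        if mul(n, m, p) == 1:
            return m
    raise ValueError("no inverse found; p is probably not prime")


print(add(6, 5))                            # the class of 11, reduced modulo 7
print(mul(6, 5))                            # the class of 30, reduced modulo 7
print([inverse(n) for n in range(1, P)])    # one inverse for each nonzero class
```

The search succeeds for every nonzero class precisely because p is prime, which is why condition (h) of the definition of a field holds in this example.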


Real Numbers 
Definition. A subset A of R is said to be bounded above if there is a real number 
M such that, for all x ∈ A, x ≤ M. The number M is called an upper bound of 
A. A is said to be bounded below if there is a real number m such that, for all 
x ∈ A, x ≥ m. The number m is called a lower bound of A. Finally, A is bounded 
if it is bounded above and below. 


It is clear that if M is an upper bound of A, then every real number greater than M 
is also an upper bound of A. This leads to the following definition. 


Definition. The least upper bound of a set A ⊆ R is the number M such that 


(a) M is an upper bound of A, and 
(b) for all ε > 0, M − ε is not an upper bound of A. Thus there is an element 
x ∈ A such that x > M − ε. 


The least upper bound of A is also called the supremum of A and is denoted by 
sup A. If A is not bounded above, we set sup A = ∞. 


Definition. The greatest lower bound of a set A ⊆ R is the number m such that 


(a) m is a lower bound of A, and 
(b) for all ε > 0, m + ε is not a lower bound of A. Thus there is an element 
x ∈ A such that x < m + ε. 


The greatest lower bound of A is also called the infimum of A and is given the 
notation inf A. If A is not bounded below, we set inf A = −∞. 


Example 2. Let A_1 = (−∞, 1), A_2 = {1/n : n ∈ N}, and A_3 = {n/(n + 1) : n ∈ N}. Then 
sup A_1 = sup A_2 = sup A_3 = 1, inf A_1 = −∞, inf A_2 = 0, and inf A_3 = 1/2. ♦ 


The completeness of R. We accept the following fact as true: 
Let A ⊆ R be nonempty and bounded above. Then A has a least upper bound. 


The above fact is not trivial; its establishment requires delving deep into the very 
definition of the real numbers, which we will not do here. The following example 
shows that Q is not complete and illustrates that the completeness of R is not to 
be mistaken for a simple fact. 


Example 3. Let A = {x ∈ Q : x² < 2}. Clearly, A is bounded above and below. 
However, sup A and inf A are not in Q. ♦ 
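

The gap in Q can be observed numerically. The following Python sketch is our own illustration and not part of the formal development. It uses exact rational arithmetic to trap the supremum of the set {x ∈ Q : x² < 2} between an element of the set and a rational upper bound. The starting bracket [1, 2], the number of steps, and the names in_A, lo, and hi are assumptions made for the example.

```python
from fractions import Fraction


def in_A(x):
    """Membership test for the set of rationals x with x^2 < 2."""
    return x * x < 2


lo, hi = Fraction(1), Fraction(2)    # lo belongs to the set; hi is an upper bound
for _ in range(20):
    mid = (lo + hi) / 2
    if in_A(mid):
        lo = mid                     # a larger element of the set
    else:
        hi = mid                     # a smaller rational upper bound
print(float(lo), float(hi))          # both are close to the square root of 2
print(hi - lo == Fraction(1, 2**20)) # the bracket has been halved 20 times
```

Each pass produces either a larger element of the set or a smaller rational upper bound, and the two never meet, because no rational number has square equal to 2.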


Definition. A sequence (a_n) of real numbers is said to converge to a ∈ R if, for 
every ε > 0, there is a natural number N such that |a_n − a| < ε for all n ≥ N. In 
this case, we say the limit of (a_n) is a, and we write lim_{n→∞} a_n = a or simply 
lim_n a_n = a. 


Definition. A sequence (a_n) of real numbers is said to diverge to ∞ if, for every 
M ∈ R, there is a natural number N such that a_n > M for all n ≥ N. In this case, 
we also say that (a_n) has limit ∞, and we write lim_n a_n = ∞. The sequence (a_n) 
is said to diverge to −∞ if, for every m ∈ R, there is a natural number N such 
that a_n < m for all n ≥ N. In this case, we also say that (a_n) has limit −∞, and we 
write lim_n a_n = −∞. 


Example 4. Let a_n = n + 1/n, b_n = 1 + (−1)^n, c_n = e^{−n}. 
The sequence (a_n) diverges to ∞, while (b_n) does not converge, nor does it 
diverge to ±∞. Finally, lim_n c_n = 0. ♦ 


Example 5. If, for every n ∈ N, a_n ≤ b_n and lim_n a_n = a, lim_n b_n = b, then a ≤ b. 
Suppose for a contradiction that b < a. Let ε = (a − b)/3. Observe that b − ε < 
b + ε < a − ε < a + ε. There exist integers N_1 and N_2 such that, for n ≥ N_1, b_n ∈ 
(b − ε, b + ε), and for n ≥ N_2, a_n ∈ (a − ε, a + ε). Now, for any n ≥ max{N_1, N_2}, 
b_n < a_n, which is a contradiction. ♦ 


Theorem 1.2.1. If a_n ≤ c_n ≤ b_n and lim_n a_n = lim_n b_n = a, then lim_n c_n = a. 


Definition. A sequence (a_n) is bounded if its range {a_1, a_2, ...} is a bounded set. 
Thus there is a positive number M such that, for all n ∈ N, |a_n| ≤ M. 


Theorem 1.2.2. A convergent sequence is bounded. ■ 


Definitions. A sequence (a_n) is non-decreasing if a_1 ≤ a_2 ≤ .... 
A sequence (a_n) is (strictly) increasing if a_1 < a_2 < .... 
A sequence (a_n) is non-increasing if a_1 ≥ a_2 ≥ .... 
A sequence (a_n) is (strictly) decreasing if a_1 > a_2 > .... 
A sequence (a_n) is monotonic if it is non-decreasing or non-increasing. 


Example 6. The sequence a_n = 1/n is decreasing, but b_n = (−1)^n/n is not monotonic. ♦ 


Theorem 1.2.3. A monotonic sequence is convergent if and only if it is bounded. 


Proof. Without loss of generality, let (a_n) be a bounded, non-decreasing sequence, 
and let A = {a_n : n ∈ N}. By assumption, A is bounded, so, by the completeness 
of R, a = sup A exists. We show that lim_n a_n = a. Let ε > 0. There is an integer 
N > 0 such that a − ε < a_N. Because (a_n) is non-decreasing, for any n ≥ N, 
a − ε < a_N ≤ a_n ≤ a < a + ε; hence lim_n a_n = a. The converse is a special case of 
theorem 1.2.2. ■ 


Definition. A sequence (a_n) is said to be a Cauchy sequence if, for every ε > 0, there 
is a natural number N such that, for all m, n ≥ N, |a_n − a_m| < ε. 


Theorem 1.2.4. A convergent sequence is a Cauchy sequence. 
Theorem 1.2.5. A Cauchy sequence is bounded. 


Proof. Let ε = 1. There is a positive integer N such that, for m, n ≥ N, |a_n − a_m| < 1. 
In particular, taking m = N, |a_n − a_N| < 1 for all n ≥ N. Thus, by the triangle 
inequality, for every n ≥ N, |a_n| = |(a_n − a_N) + a_N| ≤ |a_n − a_N| + |a_N| < 1 + 
|a_N|. Let M = max{|a_1|, ..., |a_{N−1}|, 1 + |a_N|}. Clearly, |a_n| ≤ M for all n. ■ 


Definition. Let (a_n) be a sequence, and let (n_1, n_2, ...) be a strictly increasing 
sequence of natural numbers. We say that (a_{n_k})_{k=1}^∞ is a subsequence of (a_n). 


Theorem 1.2.6. A subsequence of a convergent sequence is convergent to the same 
limit. Thus if lim_n a_n = a and (a_{n_k}) is a subsequence of (a_n), then lim_{k→∞} a_{n_k} = a. 


Proof. Let ε > 0. Since lim_n a_n = a, there is a positive integer N such that, for n ≥ N, 
|a_n − a| < ε. Since (n_k) is an increasing sequence of natural numbers, n_k ≥ k for 
every k ∈ N. Thus, for k ≥ N, n_k ≥ N and |a_{n_k} − a| < ε. ■ 


Theorem 1.2.7. Every sequence (a,,) contains a monotonic subsequence. 


Proof. Define a term a_n of the sequence to be a peak if, for every i ≥ n, a_n ≥ a_i. There 
are two cases: 


Case 1. The sequence (a_n) has finitely many peaks. Suppose k_0 is the largest positive 
integer for which a_{k_0} is a peak (take k_0 = 0 if there are no peaks), and let 
n_1 = k_0 + 1. Since a_{n_1} is not a peak, there is 
an integer n_2 > n_1 such that a_{n_2} > a_{n_1}. Continuing inductively, one can construct 
a strictly increasing sequence of positive integers n_1 < n_2 < n_3 < ... such that a_{n_1} < 
a_{n_2} < a_{n_3} < .... The sequence (a_{n_k}) is an increasing subsequence of (a_n). 

Case 2. The sequence (a_n) contains infinitely many peaks, a_{n_1}, a_{n_2}, ..., 
where (n_k) is an increasing sequence of positive integers. The sequence (a_{n_k}) is a 
non-increasing subsequence of (a_n). ■ 
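

The notion of a peak is easy to experiment with on a finite list. The sketch below is only a finite analogue of the argument, written in Python with names of our choosing (peak_indices and the sample list a are illustrative assumptions). In a finite list the last term is always a peak, so the code can only mimic Case 2: the terms at the peak positions form a non-increasing subsequence.

```python
def peak_indices(a):
    """Return the indices n such that a[n] >= a[i] for every i >= n."""
    peaks = []
    tail_max = None                        # maximum of the terms to the right of n
    for n in range(len(a) - 1, -1, -1):    # scan from right to left
        if tail_max is None or a[n] >= tail_max:
            peaks.append(n)
            tail_max = a[n]
    return peaks[::-1]                     # report the indices in increasing order


a = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]      # hypothetical sample data
idx = peak_indices(a)
print(idx)                                 # positions of the peaks
print([a[n] for n in idx])                 # the values at the peaks
```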


Theorem 1.2.8 (the Bolzano-Weierstrass theorem). Every bounded sequence 
contains a convergent subsequence. 


Proof. Let (a_n) be a bounded sequence. By the previous theorem, (a_n) contains a 
monotonic subsequence, (a_{n_k}), which is convergent by theorem 1.2.3. ■ 


The following two examples are fixtures in undergraduate real analysis books. The 
proof technique is quite common and is valid for general compact sets. See chapter 
4. First we remind the reader of the definition of continuity. 


Definition. Let X be a subset of R, and let f : X → R. We say that f is continuous 
at a point x ∈ X if, for every ε > 0, there exists δ > 0 such that |f(y) − f(x)| < ε 
whenever y ∈ X and |y − x| < δ. The function f is said to be continuous on X if 
it is continuous at every point x ∈ X. 


It is easy to see that if f is continuous at x, and (x_n) is a sequence in X such that 
lim_n x_n = x, then lim_n f(x_n) = f(x). 


Example 7. A continuous real-valued function f on a closed bounded interval 
[a,b] is bounded and attains its supremum and infimum values. 


Suppose, for a contradiction, that f is unbounded. Without loss of generality, 
assume that sup_{x∈[a,b]} f(x) = ∞. There exists a sequence (x_n) in [a, b] such that 
lim_n f(x_n) = ∞. By the Bolzano-Weierstrass theorem, (x_n) contains a convergent 
subsequence (x_{n_k}). Because [a, b] is closed, x = lim_k x_{n_k} ∈ [a, b]. Now we have the 
following contradiction: f(x) = lim_k f(x_{n_k}) = ∞. 

The proof that f attains its supremum and infimum values replicates the above 
argument. 


Definition. Let X be a subset of R, and let f : X → R. We say that f is uniformly 
continuous on X if, for every ε > 0, there exists δ > 0 such that |f(y) − f(x)| < ε 
whenever x, y ∈ X and |y − x| < δ. 


The number δ in the above definition depends on ε only and not on any particular 
x ∈ X. For example, the function f : (0, 1) → R defined by f(x) = 1/x is continuous 
but not uniformly continuous. 
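

The failure of uniform continuity for f(x) = 1/x on (0, 1) can be checked with exact arithmetic. In the Python sketch below, the points x = 1/(n + 1) and y = 1/(n + 2) are our own choice of witnesses, not taken from the text. Their distance tends to 0, yet the values of f at these points always differ by exactly 1, so no single δ can serve ε = 1.

```python
from fractions import Fraction


def f(x):
    """The function f(x) = 1/x on the open interval (0, 1)."""
    return 1 / x


for n in (1, 10, 100, 1000):
    x, y = Fraction(1, n + 1), Fraction(1, n + 2)   # two points of (0, 1)
    gap = abs(x - y)             # equals 1/((n + 1)(n + 2)), which tends to 0
    jump = abs(f(x) - f(y))      # equals (n + 2) - (n + 1) = 1 for every n
    print(n, gap, jump)
```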


Example 8. A continuous real-valued function f on a closed bounded interval 
[a, b] is uniformly continuous. 


Suppose that f is not uniformly continuous. Then there exists a positive 
number ε such that, for every n ∈ N, [a, b] contains a pair of points x_n and 
y_n such that |x_n − y_n| < 1/n and |f(y_n) − f(x_n)| ≥ ε. By the Bolzano-Weierstrass 
theorem, (x_n) contains a convergent subsequence (x_{n_k}). Let x = lim_k x_{n_k}. Observe 
that x ∈ [a, b] and lim_k y_{n_k} = x. Now, for all k ∈ N, |f(y_{n_k}) − f(x_{n_k})| ≥ ε. 
This is a contradiction because if we take the limit as k → ∞ of the left-hand 
side of the last inequality and use the continuity of f, we obtain 


0 = |f(x) − f(x)| = lim_k |f(y_{n_k}) − f(x_{n_k})| ≥ ε. ♦ 


Theorem 1.2.9 (the Cauchy criterion). A sequence in R is a Cauchy sequence if 
and only if it is convergent. 


Proof. By theorem 1.2.4, every convergent sequence is a Cauchy sequence. To prove 
the converse, let (a_n) be a Cauchy sequence. By theorem 1.2.5, (a_n) is bounded; 
hence, by theorem 1.2.8, (a_n) contains a convergent subsequence, (a_{n_k}). Let 
lim_k a_{n_k} = a. We show that lim_n a_n = a. Let ε > 0. Since (a_n) is Cauchy, there 
is a positive integer N such that, for n, m ≥ N, |a_n − a_m| < ε/2. Since lim_k a_{n_k} = a, 
there is an integer K such that, for k ≥ K, |a_{n_k} − a| < ε/2. Without loss of 
generality, we may assume that K ≥ N; thus n_K ≥ K ≥ N. Taking m = n_K and 
using the triangle inequality, for all n ≥ N, |a_n − a| ≤ |a_n − a_{n_K}| + |a_{n_K} − a| < 
ε/2 + ε/2 = ε. ■ 


Example 9. The rational field Q does not satisfy the Cauchy criterion. For 
example, the sequence of partial sums s_n = Σ_{k=0}^{n} 1/k! is a Cauchy sequence in Q, 
but its limit, e, is not in Q. ♦ 
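

One rational sequence with limit e is given by the partial sums 1/0! + 1/1! + ... + 1/n!. The following Python sketch is an illustration under that assumption; the name partial_sum is ours. It computes the partial sums exactly, so that each one is visibly a ratio of integers, and then compares them with the floating-point value of e.

```python
from fractions import Fraction
from math import e, factorial


def partial_sum(n):
    """Exact value of 1/0! + 1/1! + ... + 1/n! as a rational number."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))


s10, s15 = partial_sum(10), partial_sum(15)
print(s10)                      # a ratio of two integers
print(float(s15 - s10))         # a small positive number
print(abs(float(s15) - e))      # close to 0
```

The terms of the sequence cluster together while every term remains rational, which is exactly the behaviour the example describes.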


Remark. The completeness of R is, in fact, equivalent to the Cauchy criterion. 
See example 10 below. This is why the Cauchy criterion is sometimes used as a 
definition of the completeness of R. 


Definition. Let A be a subset of R. A real number x is called a limit point of A if, 
for every δ > 0, (x − δ, x + δ) ∩ A contains a point other than x. 


Theorem 1.2.10 (the Bolzano-Weierstrass property of bounded sets). Every 
bounded infinite subset A of R has a limit point. 


Proof. Let I_1 = [a, b] be a closed bounded interval that contains A. Bisect I_1 into 
two congruent closed subintervals. One of the two subintervals contains infinitely 
many points of A. Denote that interval by I_2. Continuing this process produces 
a sequence of subintervals I_1 ⊇ I_2 ⊇ ... such that A ∩ I_n is infinite for all n ∈ N, 
and the length of I_n is l(I_n) = (b − a)/2^{n−1}. For each n ∈ N, pick a point a_n ∈ A ∩ I_n. If 
m > n, then I_n ⊇ I_m and a_m, a_n ∈ I_n; hence |a_m − a_n| ≤ (b − a)/2^{n−1}. Since 
lim_n (b − a)/2^{n−1} = 0, (a_n) is a Cauchy sequence. Let a = lim_n a_n. Since a_i ∈ I_n for 
all i ≥ n, a ∈ I_n for all n (see exercise 9 at the end of this section). Now let δ > 0. 
Since lim_n l(I_n) = 0 and a ∈ ∩_{n=1}^∞ I_n, I_n ⊆ (a − δ, a + δ) for sufficiently large n. 
Thus (a − δ, a + δ) contains infinitely many points of A because I_n does. In particular, 
(a − δ, a + δ) ∩ A contains a point other than a. ■ 


Example 10. The completeness of R is equivalent to the Cauchy criterion. The fact 
that the completeness of R implies the Cauchy criterion has been established 
in theorem 1.2.9. Observe that the proof of theorem 1.2.9 depends heavily 
(through the intervening theorems) on theorem 1.2.3, where the completeness 
of R was crucial. We now prove that the Cauchy criterion implies the 
completeness of R. 


Let A be a nonempty subset of R that is bounded above, pick an element a_0 ∈ A, and let b_0 
be an upper bound of A. We construct two sequences (a_n) and (b_n) such that 


(1) each a_n ∈ A, and each b_n is an upper bound of A; 
(2) [a_{n+1}, b_{n+1}] ⊆ [a_n, b_n]; and 
(3) b_n − a_n ≤ (b_0 − a_0)/2^n. 


Consequently, a_{n+1} − a_n ≤ (b_0 − a_0)/2^n and b_n − b_{n+1} ≤ (b_0 − a_0)/2^n. 


Suppose a_1, ..., a_n and b_1, ..., b_n have been found. We define a_{n+1} and b_{n+1} as 
follows. Let m = (a_n + b_n)/2. If m is an upper bound of A, let a_{n+1} = a_n, and let 
b_{n+1} = m. If m is not an upper bound of A, choose an element a_{n+1} ∈ A such 
that m < a_{n+1} ≤ b_n, and define b_{n+1} = b_n.² 


We now show that (b_n) is a Cauchy sequence. Let ε > 0, and choose an integer N 
such that (b_0 − a_0)/2^{N−1} < ε. For m > n ≥ N, we have 


|b_n − b_m| = b_n − b_m = (b_n − b_{n+1}) + (b_{n+1} − b_{n+2}) + ... + (b_{m−1} − b_m) 
≤ (b_0 − a_0)[1/2^n + ... + 1/2^{m−1}] ≤ (b_0 − a_0)[1 + 1/2 + ...]/2^n 
= (b_0 − a_0)/2^{n−1} ≤ (b_0 − a_0)/2^{N−1} < ε. 


By the Cauchy criterion, the sequence (b_n) has a limit, say, b. An argument 
identical to the one above shows that (a_n) is convergent, and since b_n − a_n ≤ 
(b_0 − a_0)/2^n, lim_n a_n = b. 


Finally, we prove that b = sup A. If a > b for an element a ∈ A, then a > b_n for some 
n, which contradicts the fact that b_n is an upper bound of A. Thus b is an upper 
bound of A. For any number c < b, let ε = b − c. Since lim_n a_n = b, there exists an 
integer n such that a_n ∈ (b − ε, b + ε). In particular, a_n > c; hence c is not an upper 
bound of A. ♦ 


² Observe that if a_{n+1} = b_n, the process terminates and b_n is the least upper bound (in fact, 
the maximum) of A. Otherwise, the process continues ad infinitum, and each b_n is a strict upper 
bound of A. 
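

The construction of the sequences (a_n) and (b_n) in example 10 is effectively an algorithm, and it may help to see it run. The Python sketch below is our own illustration under a strong simplifying assumption: A is a finite list of rationals, so that the test "m is an upper bound of A" can be carried out directly and sup A is simply the maximum of A. The function name sup_by_bisection, the sample list, and the number of steps are hypothetical choices.

```python
from fractions import Fraction


def sup_by_bisection(A, b0, steps=60):
    """Run the construction of (a_n) and (b_n) for a finite list A with upper bound b0."""
    a, b = A[0], b0                        # a_0 is in A, b_0 is an upper bound of A
    for _ in range(steps):
        m = (a + b) / 2
        if all(x <= m for x in A):         # m is an upper bound: move b down to m
            b = m
        else:                              # otherwise choose an element of A in (m, b]
            a = min(x for x in A if x > m)
    return a, b


A = [Fraction(1, 3), Fraction(5, 7), Fraction(2, 5)]
a, b = sup_by_bisection(A, Fraction(3))
print(a, float(b - a))                     # a is the maximum of A; b - a is tiny
```

At every step the gap b − a is at most half of the previous gap, which is the content of condition (3) and the reason both sequences are forced to the same limit.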
Definition. The extended real line is R̄ = R ∪ {−∞, ∞}. We need this extension 
of R because the limits of some sequences are infinite and because it is sometimes 
convenient to allow functions to take infinite values. We retain the usual 
ordering on R, and, for x ∈ R, we define −∞ < x < ∞. The following rules of 
arithmetic in R̄ are convenient and widely accepted: 


(a) For a real number a, a + ∞ = ∞, a − ∞ = −∞. 
(b) If a > 0, then a·∞ = ∞, and a·(−∞) = −∞, while if a < 0, then 
a·∞ = −∞, and a·(−∞) = ∞. 
(c) For any real number a, a/(±∞) = 0. 
(d) In chapter 8, we adopt the convention that 0·∞ = 0. However, this definition 
is specific to integration theory. 


We do not define the operation ∞ − ∞. 


Definition. A point a ∈ R is a limit point of a sequence (a_n) if, for every 
ε > 0, |a_n − a| < ε for infinitely many n. We say that ∞ is a limit point of (a_n) 
if, for every M ∈ R, a_n > M for infinitely many n. Likewise, −∞ is a limit point 
of (a_n) if, for every m ∈ R, a_n < m for infinitely many n. 


Remark. Not every limit point of a sequence is a limit point of its range. For 
example, the sequence a_n = (−1)^n has two limit points, ±1, while its range, the 
set {−1, 1}, has no limit points. 


Theorem 1.2.11. An extended real number a is a limit point of (a_n) if and only if 
there exists a subsequence (a_{n_k}) of (a_n) such that lim_k a_{n_k} = a. 


Proof. Suppose a ∈ R is a limit point of (a_n). There exists n_1 ∈ N such that 
|a_{n_1} − a| < 1. Now we can find an integer n_2 > n_1 such that |a_{n_2} − a| < 1/2. 
Continuing this construction produces a sequence n_1 < n_2 < n_3 < ... of integers 
such that |a_{n_k} − a| < 1/k. Thus lim_k a_{n_k} = a. The converse is trivial. We leave it to 
the reader to provide the details when a = ±∞. ■ 


Definition. Let (a_n) be a sequence, and consider the sequences 


α_n = sup_{k≥n} a_k, and β_n = inf_{k≥n} a_k. 


Clearly, (α_n) is non-increasing, and (β_n) is non-decreasing. Therefore lim_n α_n and 
lim_n β_n exist. We define the limit superior, or the upper limit, and the limit 
inferior, or the lower limit, of (a_n), respectively, as follows: 


limsup_n a_n = lim_n α_n = lim_n sup_{k≥n} a_k = inf_{n∈N} α_n, 
liminf_n a_n = lim_n β_n = lim_n inf_{k≥n} a_k = sup_{n∈N} β_n. 


Theorem 1.2.12. α = limsup_n a_n if and only if, for every ε > 0, 


(a) there is a positive integer N such that a_n < α + ε for all n ≥ N, and 
(b) a_n > α − ε for infinitely many n ∈ N. 


Proof. Suppose α = limsup_n a_n. Since α = inf_n α_n, there is a positive integer N such 
that α_N < α + ε. Now, because a_n ≤ α_n and (α_n) is non-increasing, a_n ≤ α_n ≤ α_N < 
α + ε for all n ≥ N. This proves (a). 

To prove (b), note that α − ε < α ≤ α_1 = sup{a_1, a_2, ...}. Thus there is a positive 
integer n_1 such that a_{n_1} > α − ε. Now α − ε < α ≤ α_{n_1+1} = sup{a_{n_1+1}, a_{n_1+2}, ...}. 
Thus there is a positive integer n_2 > n_1 such that a_{n_2} > α − ε. This process produces 
a subsequence (a_{n_k}) of (a_n) such that a_{n_k} > α − ε. 

To prove the converse, suppose α ∈ R satisfies conditions (a) and (b), and let 
ε > 0. By condition (b), for every n ∈ N, there exists an integer k ≥ n such that 
a_k > α − ε. Thus α_n = sup_{k≥n} a_k ≥ α − ε. Taking the limit as n → ∞ produces 
limsup_n a_n ≥ α − ε. Since ε is arbitrary, limsup_n a_n ≥ α. 

By condition (a), there exists an integer N such that a_k < α + ε for every k ≥ N. 
Thus, for every n ≥ N, α_n = sup_{k≥n} a_k ≤ α + ε. Taking the limit as n → ∞, we 
obtain limsup_n a_n ≤ α + ε. Because ε is arbitrary, limsup_n a_n ≤ α. ■ 
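

For a concrete sequence, the tail suprema and tail infima that define the upper and lower limits can be tabulated. The Python sketch below is an illustration under stated assumptions: the sequence a_n = (−1)^n (1 + 1/n) is our own example, and each tail is truncated at a hypothetical cutoff HORIZON. For this particular sequence the truncation does no harm, because the supremum and infimum of every tail are attained at one of its first two terms.

```python
HORIZON = 2000          # hypothetical cutoff for the tails


def a(n):
    """Illustrative sequence a_n = (-1)^n (1 + 1/n), defined for n >= 1."""
    return (-1) ** n * (1 + 1 / n)


def alpha(n):
    """Stand-in for the sup of a_k over k >= n: a maximum over n <= k < HORIZON."""
    return max(a(k) for k in range(n, HORIZON))


def beta(n):
    """Stand-in for the inf of a_k over k >= n: a minimum over n <= k < HORIZON."""
    return min(a(k) for k in range(n, HORIZON))


for n in (1, 10, 100, 1000):
    print(n, alpha(n), beta(n))
```

The tabulated suprema decrease toward 1 and the infima increase toward −1, in agreement with an upper limit of 1 and a lower limit of −1 for this sequence.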


Theorem 1.2.13. The upper limit of a sequence (a,,) is the largest limit point of (a,,). 


Proof. Let α = limsup_n a_n, and let ε > 0. By the previous theorem (and its proof), there 
is a positive integer N such that, for all n ≥ N, a_n < α + ε, and a subsequence (a_{n_k}) 
such that a_{n_k} > α − ε. Since lim_k n_k = ∞, there is a positive integer K such that 
n_k ≥ N for all k ≥ K. Therefore α − ε < a_{n_k} < α + ε for all k ≥ K. By theorem 1.2.11, 
α is a limit point of (a_n). If t is a limit point of (a_n), then, for infinitely many 
positive integers n, t − ε < a_n. By theorem 1.2.12, there is an integer N such that, 
for all n ≥ N, a_n < α + ε. Choosing n large enough for the last two inequalities to be 
simultaneously satisfied, we have t − ε < a_n < α + ε. Therefore t < α + 2ε. Since ε is 
arbitrary, t ≤ α. ■ 


Theorem 1.2.14. A sequence (a_n) converges to a if and only if limsup_n a_n = 
liminf_n a_n = a. 


Proof. Let α = limsup_n a_n, β = liminf_n a_n, and suppose that α = β. Let ε > 0. By theorem 
1.2.12 and problem 17 at the end of this section, there is a positive integer N such 
that, for n ≥ N, a_n < α + ε and α − ε = β − ε < a_n. Thus, for n ≥ N, α − ε < a_n < 
α + ε; hence lim_n a_n = α. Conversely, if lim_n a_n = a, then it is easy to verify that 
the conditions of theorem 1.2.12 and those of problem 17 are met with α = a and 
β = a, respectively. Hence α = β. ■ 


Complex Numbers 


Definition. A complex number z is an ordered pair (x, y) of real numbers. We use 
the symbol C for the set of complex numbers. 


The definition so far makes C nothing more than the Euclidean plane R². This 
is why the set of complex numbers is also called the complex plane. What sets C 
apart from R² is the following pair of binary operations. 


Definition. For complex numbers z = (x, y) and w = (a, b), we define the sum 
z + w = (x + a, y + b) and the product zw = (ax − by, ay + bx). The real field R 
is embedded into the complex plane in a natural way: we identify a real number 
x with the complex number (x, 0). Under the operations of complex addition 
and multiplication, the subset R̃ = {(x, 0) ∈ C : x ∈ R} is closed in the sense 
that if z and w are in R̃, then z + w and zw are in R̃. Indeed, if z = (x, 0) and 
w = (a, 0), then z + w = (x + a, 0) and zw = (ax, 0). From now on, we make no 
distinction between R and R̃ and 
simply write x for (x, 0). With this understanding, we see that if x ∈ R, then 
xw = (x, 0)(a, b) = (xa, xb). It is also straightforward to verify that the elements 
0 = (0, 0) and 1 = (1, 0) satisfy z + 0 = z and z·1 = z for all z ∈ C. Thus 0 and 1 
are the identity elements for complex addition and multiplication, respectively. 


Definition. The complex number i = (0, 1) is called the imaginary number. Now 
i² = (0, 1)·(0, 1) = (−1, 0) = −1. We therefore think of i as the square root of −1. 


Armed with the imaginary number i, we now have a convenient and notationally 
simple way to represent complex numbers. An arbitrary complex number z 
can be written as z = (x, y) = (x, 0) + (0, y) = x + y(0, 1) = x + iy. With this way 
of representing complex numbers, we can restate the definitions of complex 
addition and multiplication as follows. For complex numbers z = x + iy and 
w = a + ib, z + w = (x + a) + i(y + b) and zw = (ax − by) + i(ay + bx). Note that 
the complex operations obey the same rules as the addition and multiplication of 
linear polynomials, taking into account that i² = −1. Indeed, if we multiply out the 
product of the binomials x + iy and a + ib according to the usual rules of algebra, 
we obtain (x + iy)(a + ib) = ax + iay + ibx + i²by = (ax − by) + (ay + bx)i, which 
is consistent with the original definition of complex multiplication. Now that 
we have a convenient way of manipulating complex numbers, we can prove the 
following theorem. 


Theorem 1.2.15. With the operations of complex addition and multiplication, C is 
a field. 


Proof. Most of the defining properties of a field are easy to verify. As a sample of the 
calculations, we verify the following two properties: 


(a) Complex multiplication is associative. Let z = x + iy, w = a + ib, and let t = 
r + is be complex numbers. Then (zw)t = [(ax − by) + i(ay + bx)](r + is) = 
r(ax − by) − s(ay + bx) + i[r(ay + bx) + s(ax − by)]. The reader can calculate 
z(wt) and reconcile the result with the above expression for (zw)t. 

(b) A slightly less obvious fact is the inversion formula of a nonzero complex 
number. If z = x + iy ≠ 0, then z^{−1} = x/(x² + y²) − i y/(x² + y²). It is easy to verify that 
z z^{−1} = 1. ■ 
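

The two sample verifications can also be spot-checked by machine. The Python sketch below is our own illustration: it represents a complex number as an ordered pair of exact rationals, exactly as in the definition, and implements the sum, the product, and the inversion formula. The function names add, mul, and inv and the sample values are assumptions chosen for the example; a check on a few values is, of course, no substitute for the proof.

```python
from fractions import Fraction


def add(z, w):
    """Sum of the ordered pairs z = (x, y) and w = (a, b)."""
    (x, y), (a, b) = z, w
    return (x + a, y + b)


def mul(z, w):
    """Product of the ordered pairs z = (x, y) and w = (a, b)."""
    (x, y), (a, b) = z, w
    return (a * x - b * y, a * y + b * x)


def inv(z):
    """Inverse of a nonzero pair z = (x, y), using the inversion formula."""
    x, y = z
    d = x * x + y * y            # nonzero exactly when z is not (0, 0)
    return (x / d, -y / d)


i = (Fraction(0), Fraction(1))
z = (Fraction(3), Fraction(-4))
w = (Fraction(1, 2), Fraction(5))
t = (Fraction(-2), Fraction(7, 3))
print(mul(i, i) == (-1, 0))                        # the square of i is -1
print(mul(z, inv(z)) == (1, 0))                    # z times its inverse is 1
print(mul(mul(z, w), t) == mul(z, mul(w, t)))      # associativity on one sample
```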


Definition. For a complex number z= x + iy, x is called the real part of z, and y 
is the imaginary part of z. We use the notation x = Re(z), and y = Im(z). 


Definition. The complex conjugate of a complex number z = x + iy is the number 
z̄ = x − iy, and the absolute value (or modulus) of z is |z| = √(x² + y²). 


Theorem 1.2.16. If z and w are complex numbers, then 


(a) the conjugate of z + w is z̄ + w̄, and the conjugate of zw is z̄w̄; 
(b) z + z̄ = 2Re(z) and z − z̄ = 2i Im(z); 
(c) |z̄| = |z| and z z̄ = |z|²; 
(d) |Re(z)| ≤ |z|, |Im(z)| ≤ |z|, and |z| ≤ |Re(z)| + |Im(z)|; 
(e) z^{−1} = z̄/|z|² for z ≠ 0; and 
(f) the triangle inequality |z + w| ≤ |z| + |w|. 


Proof. The proofs are mostly computational and are left to the reader to check. We 
prove the triangle inequality below. 

Note that z w̄ is the conjugate of z̄ w; hence z w̄ + z̄ w = 2Re(z w̄) ≤ 2|z w̄| = 
2|z||w̄| = 2|z||w|. Using this, we have |z + w|² = (z + w)(z̄ + w̄) = z z̄ + z w̄ + 
z̄ w + w w̄ ≤ |z|² + 2|z||w| + |w|² = (|z| + |w|)². The result follows by taking the 
square roots of the extreme sides of the above string of inequalities. ■ 


Now that we have a measure of the length of a complex number, we have a measure 
of the distance between two points in the complex plane. For complex numbers 
z_1 and z_2, the quantity |z_1 − z_2| is exactly the Euclidean distance between z_1 and 
z_2. Now we can generalize many of the properties of subsets of the real line to 
the complex plane. For example, a bounded subset of C is a set A of complex 
numbers such that sup{|z| : z ∈ A} < ∞. For a complex number a and a positive 
real number δ, the set {z ∈ C : |z − a| < δ} is an open disk of radius δ centered 
at a. A point z ∈ C is a limit point of a set A of complex numbers if every open 
disk centered at z contains points of A other than z. We urge the reader to examine 
the rest of the concepts we studied for real numbers and generalize them to the 
complex field, whenever possible. One important distinction between R and C is 
that there is no natural (or useful) way to order the complex field. We conclude the 
section with the following theorem. 


Theorem 1.2.17 (completeness of the complex field). A complex sequence (z_n) is 
a Cauchy sequence if and only if it is convergent. 


Proof. Let z_n = x_n + iy_n, where (x_n) and (y_n) are real sequences. If (z_n) is Cauchy, 
then, given ε > 0, there exists a positive integer N such that, for n, m ≥ N, 
|z_n − z_m| < ε. It follows that the real sequences (x_n) and (y_n) are Cauchy 
sequences, since |x_n − x_m| ≤ |z_n − z_m| and |y_n − y_m| ≤ |z_n − z_m|. By the 
completeness of R, (x_n) and (y_n) converge to real numbers x and y, respectively. Clearly, 
(z_n) converges to z = x + iy because |z_n − z| ≤ |x_n − x| + |y_n − y|. We leave the 
proof of the converse to the reader. ■ 


The following example establishes the Bolzano-Weierstrass theorem for complex 
sequences. 


Example 11. Every bounded complex sequence contains a convergent 
subsequence. Let z_n = x_n + iy_n be a bounded sequence in C. Since |x_n| ≤ |z_n|, (x_n) 
is bounded. By theorem 1.2.8, (x_n) has a convergent subsequence (x_{n_k}). Let 
x = lim_k x_{n_k}. Now the sequence (y_{n_k}) is bounded, so it contains a convergent 
subsequence (y_{n_{k_p}}). Let y = lim_p y_{n_{k_p}}. The subsequence z_{n_{k_p}} = x_{n_{k_p}} + i y_{n_{k_p}} of 
(z_n) clearly converges to x + iy as p → ∞. ♦ 


Exercises 


1. Prove that the finite union of bounded subsets of R is bounded, and give an 
example to show that the conclusion is false for an infinite union of bounded 
sets. 

2. Prove that if A ⊆ R is bounded below, then A has a greatest lower bound. 
Hint: Define −A = {−x : x ∈ A}. Show that inf A = −sup(−A). 

3. Prove that if lim_n a_n = a, lim_n b_n = b, then lim_n (a_n + b_n) = a + b and that 
lim_n (a_n b_n) = ab. 

4. Let lim_n b_n = b ≠ 0. Prove that there is a natural number N such that, for 
all n ≥ N, |b_n| > |b|/2. Hence prove that if, in addition, lim_n a_n = a, then 
lim_n a_n/b_n = a/b. 


5. Prove theorem 1.2.1. 

6. Prove theorem 1.2.2. 

7. Prove theorem 1.2.4. 

8. Show that the limit of a convergent sequence is unique. 

9. Suppose all the terms of a sequence (a_n) are in a closed interval I. Prove that 
if lim_n a_n = a, then a ∈ I. 

10. Show that lim_n a_n = a if and only if every interval centered at a contains all 
but finitely many terms of the sequence. 

11. Let (δ_n) be a positive sequence such that Σ_{n=1}^∞ δ_n < ∞. Show that if (a_n) is 
such that |a_{n+1} − a_n| ≤ δ_n, then (a_n) is convergent. Hint: Examine the proof 
in example 10. 

12. Prove that a point x is a limit point of a subset A of R if and only if every 
interval centered at x contains infinitely many points of A. 

13. Show that if A is bounded above and a = sup A, then there is a sequence 
(a_n) in A such that lim_n a_n = a. Also prove that if a ∉ A, the terms of the 
sequence (a_n) can be chosen to be distinct. 

14. Prove that if {I_n}_{n∈N} is a descending sequence of closed bounded intervals, 
then ∩_{n∈N} I_n is a closed interval or a point. Give examples to show that the 
result is false if either of the conditions closed or bounded is omitted. 

15. Let A and B be nonempty subsets of Q such that A ∪ B = Q and, for 
every a ∈ A and every b ∈ B, a < b.* Prove that exactly one of the following 
alternatives holds: 
(a) there exists a number α ∈ Q such that A = Q ∩ (−∞, α], 
(b) there exists a number α ∈ Q such that A = Q ∩ (−∞, α), or 
(c) there exists a number α ∈ R − Q such that A = Q ∩ (−∞, α). 

16. Suppose (a_n) and (b_n) are real sequences. Prove that 
(a) liminf_n a_n ≤ limsup_n a_n; 
(b) liminf_n (−a_n) = −limsup_n a_n and limsup_n (−a_n) = −liminf_n a_n; 
(c) liminf_n a_n + liminf_n b_n ≤ liminf_n (a_n + b_n); 
(d) limsup_n (a_n + b_n) ≤ limsup_n a_n + limsup_n b_n; and 
(e) if a_n ≤ b_n, then limsup_n a_n ≤ limsup_n b_n. 

17. Prove that β = liminf_n a_n if and only if, for every ε > 0, 
(a) there is a positive integer N such that a_n > β − ε for all n ≥ N, and 
(b) a_n < β + ε for infinitely many n ∈ N. 

18. Show that liminf_n a_n is the smallest limit point of (a_n). 
19. Let (a_n) be a positive sequence. Prove that limsup_n (a_n)^{1/n} ≤ limsup_n a_{n+1}/a_n. 

20. Verify the details of the proof of theorem 1.2.15. 

21. Verify the details of the proof of theorem 1.2.16. 

22. Verify the details of the proof of theorem 1.2.17. 

23. Prove that every bounded infinite subset of C has a limit point. 

24. (a) Show that the series Σ_{n=0}^∞ z^n/n! is absolutely convergent for all z ∈ C. 
(b) Define e^z = Σ_{n=0}^∞ z^n/n!. Show that, for all z, w ∈ C, e^z e^w = e^{z+w}. Conclude 
that, for n ∈ N and z ∈ C, (e^z)^n = e^{nz}. Hint: Recall that absolutely convergent 
series can be multiplied term by term. The reader will recognize e^z as 
the complex exponential function. 

25. (a) Show that, for θ ∈ R, e^{iθ} = cos θ + i sin θ. Hint: Recall that the terms of 
an absolutely convergent series can be rearranged without affecting the sum 
of the series. 
(b) Show that if z is a nonzero complex number, then there is a unique 
positive number r and a unique real number θ ∈ [0, 2π) such that z = re^{iθ}. 
Hint: Write z = |z|w, where w = z/|z|. Note that |w| = 1. 

26. Show that, for θ ∈ R, (cos θ + i sin θ)^n = cos(nθ) + i sin(nθ). 

27. Let z be a nonzero complex number, and write z = re^{iθ}. Show that, for n ≥ 2, 
each of the numbers ξ_k = r^{1/n} e^{i(θ+2πk)/n}, 0 ≤ k ≤ n − 1, satisfies (ξ_k)^n = z. The 
numbers ξ_k are the nth roots of z. 


* Such a partition of Q is called a Dedekind cut. 


2 
Set Theory 


A false conclusion once arrived at and widely accepted is not easily dislodged 
and the less it is understood the more tenaciously it is held. 
Georg Cantor 


Georg Cantor. 1845-1918 


Georg Cantor entered the Polytechnic of Zürich in 1862 to study engineering. He 
later moved to the University of Berlin, where he attended lectures by Weierstrass, 
Kummer, and Kronecker, completing his dissertation on number theory in 1867. 


In 1873 Cantor proved the countability of the set of rational numbers. He then 
proved that the real numbers were uncountable and published the result in 1874. 
It is in that paper that the idea of a one-to-one correspondence appeared for the 
first time. He next pondered the question of whether the unit interval could be 
put in a one-to-one correspondence with the unit square. He initially dismissed 
the possibility and wrote that “the answer seems so clearly to be ‘no’ that proof 
appears almost unnecessary.” When he did prove the result, he wrote to Dedekind 
in 1877, “I see it, but I don't believe it!” In a paper published in 1878, he made the 
concept of one-to-one correspondence precise and discussed sets of equal power, 
that is, sets which have equal cardinality. 


Between 1879 and 1884, Cantor published a series of six papers designed to 
provide a basic introduction to set theory, and this is when he realized that his 
work was not finding the acceptance that he had hoped. In fact, Cantor’s ideas 
earned him the strong antagonism of Kronecker, among other mathematicians 
and philosophers. Dedekind was sympathetic to Cantor’s work and in 1888 wrote 
his article Was sind und was sollen die Zahlen [What are numbers and what should 
they be], partially in defense of Cantor’s work. 


Cantor’s last major papers on set theory appeared in 1895 and 1897, where he 
had hoped, without success, to include a proof of the continuum hypothesis. He 
did, however, succeed in formulating his theory of well-ordered sets and ordinal 
numbers. It was also during those years that Cantor discovered the first paradoxes 
of set theory. 


Hilbert described Cantor’s work as “the finest product of mathematical genius and 
one of the supreme achievements of purely intellectual human activity.” 


Cantor’s personal life was not entirely a happy one. For more than thirty years, 
Cantor was troubled with bouts of depression, and, in 1899, he suffered the death 
of his youngest son. He spent the last year of his life confined to a sanatorium, 
where he died of a heart attack. 


2.1 Finite, Countable, and Uncountable Sets 


Cantor’s revolutionary ideas were initially focused on understanding infinite sets. 
His starting point was, as is ours in this chapter, set equivalence. The title of the 
section accurately captures its objectives: to formulate clear definitions of finite 
and infinite sets, and to study their properties in good detail. Among the results 
we establish are Dedekind’s definition of an infinite set, the countability of Q, and, 
in general, the countability of a countable union of countable sets. We conclude the 
section by showing the existence of uncountable sets through the establishment of 
the fact that $2^{\mathbb{N}}$ and $\mathbb{R}$ are uncountable.


Definition. Two sets $A$ and $B$ are equivalent if there is a bijection from $A$ to $B$.¹
We use the notation $A \sim B$ to indicate the equivalence of $A$ and $B$.


Example 1. The set $2\mathbb{N}$ of even positive integers is equivalent to $\mathbb{N}$. The function
$f : \mathbb{N} \to 2\mathbb{N}$ defined by $f(n) = 2n$ is bijective. ♦


¹ The term equipotent is also used.
Example 2. The closed interval $[0,1]$ is equivalent to an arbitrary closed interval
$[a, b]$ ($a < b$). The function $f(x) = \frac{x-a}{b-a}$ is a bijection from $[a, b]$ to $[0,1]$. ♦


Example 3. The closed interval [0,1] is equivalent to the open interval (0, 1). 
Define a function $f : [0,1] \to (0,1)$ as follows:


$$f(x) = \begin{cases} 1/2 & \text{if } x = 0, \\ 1/(n+2) & \text{if } x = 1/n,\ n \in \mathbb{N}, \\ x & \text{otherwise.} \end{cases}$$


It is easy to verify that $f$ is a bijection.
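

The definition above can be checked on a few sample points. The following Python sketch is only an illustration: the function name `f` mirrors the text, the sample list is an arbitrary choice, and exact fractions are used so that the test $x = 1/n$ is reliable.

```python
from fractions import Fraction

def f(x: Fraction) -> Fraction:
    """Map [0, 1] into (0, 1): shift 0 and the points 1/n along 1/2, 1/3, 1/4, ..."""
    if x == 0:
        return Fraction(1, 2)
    if x.numerator == 1:              # x = 1/n for some positive integer n
        return Fraction(1, x.denominator + 2)
    return x                          # every other point is left fixed

# Arbitrary sample points of [0, 1], including both endpoints.
samples = [Fraction(0), Fraction(1), Fraction(1, 2), Fraction(1, 3), Fraction(2, 3)]
images = [f(x) for x in samples]      # 1/2, 1/3, 1/4, 1/5, 2/3
```

The endpoints 0 and 1 are absorbed into the sequence 1/2, 1/3, 1/4, ..., which is why no point of (0, 1) is missed and no two points collide.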


Example 4. Let $A = (-\pi/2, \pi/2)$, $B = \mathbb{R}$. The function $f(x) = \tan(x)$ is a bijection
from $A$ to $B$. Thus $A \sim B$.


Theorem 2.1.1. Let $A$, $B$, and $C$ be sets. Then


(a) $A \sim A$.
(b) If $A \sim B$, then $B \sim A$.
(c) If $A \sim B$ and $B \sim C$, then $A \sim C$.


Proof. (a) The identity function $I_A : A \to A$ is a bijection.
(b) If $f : A \to B$ is a bijection, then $f^{-1} : B \to A$ is a bijection.
(c) If $f : A \to B$ and $g : B \to C$ are bijections, then $g \circ f : A \to C$ is a bijection. ∎


Definitions. For $n \in \mathbb{N}$, let $\mathbb{N}_n = \{1, 2, \ldots, n\}$. A set $A$ is finite if $A \sim \mathbb{N}_n$ for some
$n \in \mathbb{N}$. A set is said to be infinite if it is not finite. If $A \sim \mathbb{N}_n$, we say that the
cardinality of $A$ is $n$, and we write $\mathrm{Card}(A) = n$. The cardinality of a finite set is
simply the number of elements in it. We also define $\mathrm{Card}(\emptyset) = 0$.


Theorem 2.1.2. 
(a) A proper subset $B$ of $\mathbb{N}_n$ is finite, and $\mathrm{Card}(B) = m$ for some $m < n$.
(b) A proper subset $B$ of a finite set $A$ is finite, and $\mathrm{Card}(B) < \mathrm{Card}(A)$.
(c) If $m, n \in \mathbb{N}$ and $m < n$, then there is no injection from $\mathbb{N}_n$ to $\mathbb{N}_m$.
(d) A finite set is not equivalent to any of its proper subsets. 
(e) N is infinite. 


Proof. (a) We proceed by induction. The only proper subset of $\mathbb{N}_1$ is $\emptyset$, and
$\mathrm{Card}(\emptyset) = 0 < 1 = \mathrm{Card}(\mathbb{N}_1)$. Suppose the statement is true for some integer
$n \geq 1$, and let $B$ be a proper subset of $\mathbb{N}_{n+1}$. If $n + 1 \notin B$, then $B \subseteq \mathbb{N}_n$, and,
by the inductive hypothesis, $B$ is finite and $\mathrm{Card}(B) \leq n < n + 1$. Otherwise,
$B = \{n+1\} \cup C$, where $C$ is a proper subset of $\mathbb{N}_n$. By the inductive hypothesis,
$C$ is finite, and $\mathrm{Card}(C) = m < n$. Let $g$ be a bijection from $\mathbb{N}_m$ to $C$. Define
$f : \mathbb{N}_{m+1} \to B$ by


$$f(x) = \begin{cases} g(x) & \text{if } x \in \mathbb{N}_m, \\ n + 1 & \text{if } x = m + 1. \end{cases}$$


Clearly, $f$ is a bijection; hence $\mathrm{Card}(B) = m + 1 < n + 1$.


(b) Suppose $\mathrm{Card}(A) = n$, and let $f$ be a bijection from $A$ to $\mathbb{N}_n$. The restriction of $f$
to $B$ is a bijection from $B$ onto $f(B)$. By part (a), $f(B)$ is finite, and $\mathrm{Card}(f(B)) < n$;
thus $\mathrm{Card}(B) = \mathrm{Card}(f(B)) < n$.


(c) If $f$ is an injection from $\mathbb{N}_n$ to $\mathbb{N}_m$, then $B = f(\mathbb{N}_n)$ is a subset of $\mathbb{N}_m$. By part
(a), $n = \mathrm{Card}(\mathbb{N}_n) = \mathrm{Card}(B) \leq m$. This contradiction shows that no such $f$ exists.


(d) Suppose $B$ is a proper subset of a finite set $A$, and let $n = \mathrm{Card}(A)$ and
$m = \mathrm{Card}(B)$. By part (b), $m < n$. Let $g : B \to \mathbb{N}_m$ and $h : \mathbb{N}_n \to A$ be bijections. If
there is a bijection $f : A \to B$, then $g \circ f \circ h$ would be an injection from $\mathbb{N}_n$ to $\mathbb{N}_m$.
This contradicts part (c).


(e) If, for some positive integer $m$, there exists a bijection $f : \mathbb{N} \to \mathbb{N}_m$, then, for
any integer $n > m$, the restriction of $f$ to $\mathbb{N}_n$ would be an injection from $\mathbb{N}_n$ into
$\mathbb{N}_m$. This contradicts part (c) and completes the proof.


Corollary 2.1.3. If $A$ is infinite and $A \subseteq B$, then $B$ is infinite.


Proof. If B were finite, A would also be finite by theorem 2.1.2. 


Theorem 2.1.4. A set $A$ is infinite if and only if it contains a sequence of distinct
elements.


Proof. Suppose $A$ is infinite. First we show that, for $n \in \mathbb{N}$, $A$ contains a set of exactly
$n$ elements. The proof is inductive, and here is the inductive step: Having found a
subset $\{a_1, \ldots, a_n\}$ of $A$ containing exactly $n$ elements, we pick an element
$a_{n+1} \in A - \{a_1, \ldots, a_n\}$. Such an $a_{n+1}$ exists because otherwise $A$ would be equal to
$\{a_1, \ldots, a_n\}$, which is finite. The set $\{a_1, \ldots, a_{n+1}\}$ has exactly $n + 1$ elements.

For $n = 0, 1, 2, \ldots$, let $B_n$ be a subset of $A$ of exactly $2^n$ elements. Define $C_0 = B_0$,
and, for $n \in \mathbb{N}$, let $C_n = B_n - \bigcup_{i=0}^{n-1} B_i$. Now $\mathrm{Card}(\bigcup_{i=0}^{n-1} B_i) \leq \sum_{i=0}^{n-1} 2^i = 2^n - 1$.
Hence $\mathrm{Card}(C_n) \geq 2^n - (2^n - 1) = 1$. Thus the sets $C_n$ are disjoint and nonempty.
We choose an element $c_n$ from each $C_n$, and we obtain a sequence of distinct
elements of $A$. The converse is true because $\mathbb{N}$ is infinite.




Theorem 2.1.5. A set A is infinite if and only if it is equivalent to one of its proper 
subsets. 


Proof. If $A$ is equivalent to one of its proper subsets, $A$ is infinite by theorem 2.1.2(d).
Conversely, if $A$ is infinite, by theorem 2.1.4, $A$ contains a sequence of distinct
elements $(b_1, b_2, \ldots)$. Let $B = \{b_1, b_2, \ldots\}$ and define a function $f : A \to A - \{b_1\}$
as follows: for $n \in \mathbb{N}$, $f(b_n) = b_{n+1}$, and $f(x) = x$ if $x \notin B$. As $f$ is clearly a bijection,
$A \sim A - \{b_1\}$. ∎


Example 5. The closed interval [0, 1] is infinite because it is equivalent to its subset 
(0, 1). (See example 3.) 


Definition. A set $A$ is countable if it is equivalent to $\mathbb{N}$. A bijection $f : \mathbb{N} \to A$
is called an enumeration (sequencing) of A. A set is said to be at most countable 
if it is finite or countable. If an infinite set is not countable, we say it is 
uncountable. 


Theorem 2.1.6. N x N is countable. 


Proof. We enumerate $\mathbb{N} \times \mathbb{N}$ recursively as follows: $f(1) = (1,1)$, and once $f(n)$ has
been defined, say, $f(n) = (a,b)$, define


$$f(n+1) = \begin{cases} (a-1, b+1) & \text{if } a > 1, \\ (b+1, 1) & \text{if } a = 1. \end{cases}$$

We pictorially think of $\mathbb{N} \times \mathbb{N}$ as the integer points in the open first quadrant
of the plane. The above enumeration sequences the integer points in the open
first quadrant along each diagonal $a + b = \text{constant}$, $a, b \in \mathbb{N}$, from bottom to
top. Once the top of a diagonal has been reached, we start at the bottom of the
next diagonal. See figure 2.1. It is clear that $f$ is a bijection from $\mathbb{N}$ to $\mathbb{N} \times \mathbb{N}$.
The enumeration trick in this proof is attributed to Cantor. Another proof of this
theorem is provided by exercise 15 on section 1.1.
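

The recursion in the proof translates directly into a short program. The Python sketch below is illustrative only: the names `successor`, `enumerate_pairs`, and `index_of` are ours, and `index_of` implements the closed form for the inverse that exercise 8 of this section asks the reader to establish.

```python
def successor(a: int, b: int) -> tuple[int, int]:
    """One step of the recursion: move up the current diagonal or start the next one."""
    return (a - 1, b + 1) if a > 1 else (b + 1, 1)

def enumerate_pairs(count: int) -> list[tuple[int, int]]:
    """Return f(1), ..., f(count) for the enumeration f of N x N (count >= 1)."""
    pairs = [(1, 1)]
    while len(pairs) < count:
        pairs.append(successor(*pairs[-1]))
    return pairs

def index_of(a: int, b: int) -> int:
    """Position of (a, b) in the enumeration, computed without running the recursion."""
    return (a + b - 2) * (a + b - 1) // 2 + b

first = enumerate_pairs(10)
# first == [(1, 1), (2, 1), (1, 2), (3, 1), (2, 2), (1, 3), (4, 1), (3, 2), (2, 3), (1, 4)]
```

The term $(a+b-2)(a+b-1)/2$ counts the points on the diagonals that precede the one through $(a, b)$, and $b$ is the position of $(a, b)$ on its own diagonal.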


Theorem 2.1.7. An infinite subset B of a countable set A is countable. 


Proof. Let $A = \{a_1, a_2, \ldots\}$ be an enumeration of $A$. Let $n_1$ be the least positive
integer such that $a_{n_1} \in B$. Suppose we have found integers $n_1 < n_2 < \cdots < n_k$ such
that, for $1 \leq i \leq k-1$, $n_{i+1}$ is the least positive integer greater than $n_i$ for which
$a_{n_{i+1}} \in B$. Define $n_{k+1}$ to be the least integer greater than $n_k$ such that $a_{n_{k+1}} \in B$.


Figure 2.1 Cantor’s trick


We claim that $B = \{a_{n_k} : k \in \mathbb{N}\}$. If not, then there exists an element $b \in B$ such
that $b \neq a_{n_k}$ for all $k \in \mathbb{N}$. Now $b \in A$, so $b = a_n$ for some positive integer $n$. By
assumption, $n \neq n_k$ for all $k \in \mathbb{N}$. Because $(n_k)$ is a strictly increasing sequence of
positive integers, there are two possibilities: either $n < n_1$ or there is a unique $k \in \mathbb{N}$
such that $n_k < n < n_{k+1}$. The former possibility would contradict the definition of
$n_1$, and the latter possibility contradicts the definition of $n_{k+1}$. This shows that
$B = \{a_{n_k} : k \in \mathbb{N}\}$, so $k \mapsto a_{n_k}$ is an enumeration of $B$. ∎


Theorem 2.1.8. If there exists an injection $f$ from a set $A$ to $\mathbb{N}$, then $A$ is at most
countable.


Proof. Without loss of generality, assume $A$ is infinite. Since $f$ is one-to-one, $R(f)$
(the range of $f$) is an infinite subset of $\mathbb{N}$. By theorem 2.1.7, $R(f)$ is countable.
Therefore $A$ is countable since it is in one-to-one correspondence with $R(f)$. ∎


Theorem 2.1.9. A set $A$ is at most countable if and only if there is a surjection
$f : \mathbb{N} \to A$.


Proof. If $A$ is countable, a bijection exists from $\mathbb{N}$ onto $A$. If $A$ is finite and
$\mathrm{Card}(A) = n$, then there exists a bijection $f : \{1, 2, \ldots, n\} \to A$. Extend $f$ to a
surjection $f : \mathbb{N} \to A$ by defining $f(m) = f(n)$ for all $m > n$.

Conversely, if there exists a surjection $f : \mathbb{N} \to A$, the sets $S_a = f^{-1}(a)$, $a \in A$,
are mutually disjoint and nonempty. Choose an element $n_a$ from each of the sets
$S_a$ and define a function $g : A \to \mathbb{N}$ by $g(a) = n_a$. Clearly, $g$ is one-to-one. Now $A$
is at most countable by theorem 2.1.8.
Theorem 2.1.10. A countable union of countable sets is countable.


Proof. Let $\{A_n\}$ be a countable collection of countable sets, and let $A = \bigcup_{n=1}^{\infty} A_n$.
Write $A_n = \{a_{n1}, a_{n2}, \ldots\}$. Define $f : \mathbb{N} \times \mathbb{N} \to \bigcup_{n=1}^{\infty} A_n$ by $f(m, n) = a_{mn}$. Clearly, $f$
is onto. By theorem 2.1.6, there exists a bijection $g : \mathbb{N} \to \mathbb{N} \times \mathbb{N}$. The composition
$f \circ g$ maps $\mathbb{N}$ onto $A$. By theorem 2.1.9, $A$ is at most countable. Since $A$ contains
the infinite set $A_1$, $A$ is countable.
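

As an illustration of this proof, the following Python sketch enumerates a union of countably many countable sets by composing the diagonal enumeration of $\mathbb{N} \times \mathbb{N}$ with the map $(m, n) \mapsto a_{mn}$. The choice $A_m = \{n/m : n \in \mathbb{N}\}$, whose union is the set of positive rationals, and all function names are assumptions made for the example. Because the surjection repeats elements, the sketch keeps only the first occurrence of each.

```python
from fractions import Fraction
from itertools import islice

def pairs():
    """Yield the points of N x N along the diagonals, as in the proof of theorem 2.1.6."""
    a, b = 1, 1
    while True:
        yield a, b
        a, b = (a - 1, b + 1) if a > 1 else (b + 1, 1)

def union_enumeration():
    """Enumerate the union of the sets A_m = {n/m : n in N} without repetitions."""
    seen = set()
    for m, n in pairs():
        q = Fraction(n, m)            # the element a_{mn} of A_m
        if q not in seen:             # the surjection repeats elements; keep first hits
            seen.add(q)
            yield q

first_rationals = list(islice(union_enumeration(), 8))
# first_rationals == [1, 1/2, 2, 1/3, 3, 1/4, 2/3, 3/2] as Fractions
```

Discarding repetitions is one concrete way to pass from a surjection defined on $\mathbb{N}$ to an enumeration, which is the content of theorem 2.1.9.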


Corollary 2.1.11. Z and Q are countable. 


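Proof. Use theorem 2.1.10 and the facts that $\mathbb{Z} = \mathbb{N} \cup \{0\} \cup -\mathbb{N}$ and
$\mathbb{Q} = \bigcup_{n=1}^{\infty} \{\frac{m}{n} : m \in \mathbb{Z}\}$. ∎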


Theorem 2.1.12. The set $\mathcal{S}$ of finite sequences in a countable set $A$ is countable.


Proof. For each $n \in \mathbb{N}$, let $A^n$ be the family of sequences in $A$ of exact length $n$. As
a consequence of theorem 2.1.6 (see problem 9 at the end of this section), $A^n$ is
countable. Since $\mathcal{S} = \bigcup_{n=1}^{\infty} A^n$, $\mathcal{S}$ is countable by theorem 2.1.10.


Theorem 2.1.13. $2^{\mathbb{N}}$ is uncountable.


Proof. Recall that $2^{\mathbb{N}}$ is the set of all functions from $\mathbb{N}$ to $\{0, 1\}$ (binary sequences).
Suppose, for a contradiction, that $2^{\mathbb{N}}$ is countable. Then $2^{\mathbb{N}} = \{x_1, x_2, \ldots\}$, where
each $x_i$ is a binary sequence, say, $x_i = (x_{i1}, x_{i2}, \ldots)$, and each $x_{ij}$ is 0 or 1. The binary
sequence $y = (y_1, y_2, \ldots)$, where


$$y_i = \begin{cases} 0 & \text{if } x_{ii} = 1, \\ 1 & \text{if } x_{ii} = 0, \end{cases}$$


is clearly not equal to any $x_i$. This contradiction establishes the theorem.
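

The diagonal construction can be tried on a finite list, which is of course only a toy version of the argument. In the Python sketch below, the name `diagonal_flip` and the sample rows are ours, and each row is truncated to as many terms as there are rows.

```python
def diagonal_flip(rows: list[list[int]]) -> list[int]:
    """Return y with y[i] = 1 - rows[i][i], so that y differs from row i in position i."""
    return [1 - rows[i][i] for i in range(len(rows))]

# Four truncated binary sequences, chosen arbitrarily.
rows = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
y = diagonal_flip(rows)               # [1, 0, 1, 1]
```

Whatever rows are supplied, `y` disagrees with the $i$th row in its $i$th entry, so it cannot appear in the list.
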
Corollary 2.1.14. If a set $A$ contains at least two elements, then $A^{\mathbb{N}}$ is uncountable.


Proof. Pick two distinct elements $a_0$ and $a_1$ in $A$. By the previous theorem, $\{a_0, a_1\}^{\mathbb{N}}$
is uncountable. Consequently, $A^{\mathbb{N}}$ is uncountable because it contains $\{a_0, a_1\}^{\mathbb{N}}$. See
problem 4 at the end of this section.


Theorem 2.1.15. The interval $(0, 1]$ is uncountable. Consequently, $\mathbb{R}$ is uncountable.


Proof. Let $T$ be the set of binary sequences which contain only a finite number
of nonzero terms. Each of the sets $T_n$ of binary sequences of length $n$ is
finite, and $T$ is equivalent to $\bigcup_{n=1}^{\infty} T_n$. Therefore $T$ is countable. It follows that
$A = 2^{\mathbb{N}} - T$ is uncountable, by problem 6 at the end of this section. We construct
an injection $f$ from $A$ to $(0,1]$ as follows: for a sequence $a = (a_1, a_2, \ldots) \in A$,
define $f(a) = \sum_{i=1}^{\infty} a_i/2^i$. To prove that $f$ is one-to-one, suppose $a = (a_1, a_2, \ldots)$ and
$b = (b_1, b_2, \ldots)$ are distinct sequences in $A$. Let $n$ be the least positive integer
for which $a_n \neq b_n$, and assume that $a_n = 1$, $b_n = 0$. If $t = \sum_{i=1}^{n-1} a_i/2^i = \sum_{i=1}^{n-1} b_i/2^i$,
then $f(b) \leq t + \sum_{i=n+1}^{\infty} 1/2^i = t + 1/2^n$. Since $(a_i)$ contains an infinite number
of terms that are equal to 1, let $m$ be such that $m > n$ and $a_m = 1$. Now
$f(a) \geq t + 1/2^n + 1/2^m > f(b)$. Thus $R(f)$ is uncountable. Since $R(f) \subseteq (0,1] \subseteq \mathbb{R}$, both
$(0,1]$ and $\mathbb{R}$ are uncountable. See problem 4 at the end of this section. ∎


Remark. It is easy to see that the function $f$ in the above proof is onto and hence
$A \sim (0, 1]$. See problem 11 at the end of this section.


Exercises 


1. Show that any two (bounded or unbounded) intervals in $\mathbb{R}$ are equivalent.
2. Show that $A \times B \sim B \times A$.
3. Show that if $A \sim B$ and $C \sim D$, then $A \times C \sim B \times D$ and $A^C \sim B^D$.
4. Prove that if $A \subseteq B$ and $A$ is uncountable then $B$ is uncountable.
5. Show that, for any two sets $A$ and $B$, there exist disjoint sets $C$ and $D$ such
that $A \sim C$ and $B \sim D$.
6. Prove that if $A$ is a countable subset of an uncountable set $B$, then $B - A$ is
uncountable and $B - A \sim B$.
7. Let $f : A \to B$.
(a) Show that if $f$ is onto and $B$ is uncountable, then $A$ is uncountable.
(b) Show that if $f$ is one-to-one and $A$ is uncountable, then $B$ is uncountable.
8. Let $g : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ be the inverse of the function $f$ in the proof of theorem
2.1.6. Show that, for $(a,b) \in \mathbb{N} \times \mathbb{N}$, $g(a, b) = \frac{1}{2}(a+b-2)(a+b-1) + b$.
9. Show that if $A$ is a countable set, then $A^n$ is countable.
10. Let $A$ be a countable set. Show that the collection of finite subsets of $A$ is
countable.
11. In connection with the proof of theorem 2.1.15, show that $A \sim (0,1] \sim \mathbb{R}$.
12. A real number $x$ is said to be algebraic if it is a root of some polynomial
equation with integer coefficients. For example, $\sqrt{2}$ is algebraic. Prove that
the set of algebraic numbers is countable. If a real number is not algebraic,
it is said to be transcendental. It can be shown, for example, that $e$ and $\pi$ are
transcendental numbers. Conclude that the set of transcendental numbers
is uncountable.
2.2 Zorn’s Lemma and the Axiom of Choice


The axiom of choice is one of the most useful tools in set theory. Although 
it is easy to state and widely accepted, the axiom of choice has also generated 
much controversy among mathematicians. In this section, we study the axiom 
of choice and its most famous and widely applicable equivalent: Zorn’s lemma, 
which is an indispensable tool in this book. The section and the section exercises 
contain typical but illuminating illustrations of how Zorn’s lemma is applied. In 
this section, we also study partially ordered, linearly ordered, and well-ordered sets 
and establish results such as the Schröder-Bernstein theorem, which will help us
study cardinal numbers in the next section. Although ordinal numbers have been 
avoided in this book, the section exercises are largely focused on well-ordered sets. 


Definition. Let $A$ be a nonempty set. A partial ordering on $A$ is a relation $\leq$ on
$A$ such that, for all $x$, $y$, and $z \in A$,


(a) $x \leq x$,
(b) if $x \leq y$ and $y \leq z$, then $x \leq z$, and
(c) if $x \leq y$ and $y \leq x$, then $x = y$.


If $x \leq y$ and $x \neq y$, we write $x < y$.
A relation satisfying condition (c) is called antisymmetric. 


Definition. Let $A$ be a nonempty set. A partial ordering $\leq$ on $A$ is said to be a
linear (or total) ordering if it also satisfies the condition that, for $x, y \in A$, either
$x \leq y$ or $y \leq x$. In this case, we say that $A$ is linearly ordered by $\leq$. A linearly
ordered set is commonly called a chain. 


Example 1. Let $A = \mathcal{P}(\mathbb{N})$. Order $A$ by set inclusion. Thus if $S$ and $T$ are subsets
of $\mathbb{N}$, then $S \leq T$ means that $S \subseteq T$. Set inclusion is a partial ordering of $A$. It is
not total because if $S$ and $T$ are subsets of $\mathbb{N}$, it need not be the case that $T \subseteq S$
or $S \subseteq T$. The set $\{\mathbb{N}_n : n \in \mathbb{N}\}$ is a chain in $A$. ♦


Definitions. Let $(A, \leq)$ be a partially ordered set and let $S \subseteq A$.


(a) An element $s \in S$ is the greatest element of $S$ if, for all $t \in S$, $t \leq s$. Thus $s$
exceeds every other element of $S$. The greatest element of a set, if one exists,
is unique.

(b) An element $s \in S$ is maximal if, for all $t \in S$, $t \geq s$ implies that $t = s$. Thus $s$
is not exceeded by any other element of $S$. A maximal element need not be
unique.

(c) An element $a \in A$ is an upper bound of $S$ if $s \leq a$ for all $s \in S$. Notice that
an upper bound of $S$ need not be in $S$.

(d) An element $a \in A$ is the least upper bound of $S$ if $a$ is an upper bound of $S$
and, if $b < a$, then $b$ is not an upper bound of $S$.


Example 2. Let $A = \mathbb{R}^2$ and define $\leq$ on $A$ as follows: $P = (a,b) \leq Q = (x,y)$ if
$a \leq x$ and $b \leq y$. Thus $P$ is below and to the left of $Q$. This ordering of $\mathbb{R}^2$ is
not linear because the points $(1,2)$ and $(2,1)$ are not comparable. Let $S$ (the
shaded region in fig. 2.2) be the closed subset of the third quadrant below the
line $x + y + 1 = 0$. The fact that $S$ is closed means that $S$ contains its three straight
boundaries. The set $S$ has no greatest element, since no point of $S$ is strictly
above and to the right of every other point of $S$. Every point on the line segment
$x + y + 1 = 0$, $-1 \leq x \leq 0$, is a maximal element of $S$. The set of upper bounds
of $S$ is the closed first quadrant, and the least upper bound of $S$ is $(0,0)$. ♦
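

A finite analogue of this example can be checked mechanically. The Python sketch below restricts $S$ to its integer points in the window $-3 \leq x, y \leq 0$ (an assumption made only to obtain a finite set) and computes the maximal elements and the greatest element, if any, directly from the definitions. The names `leq`, `maximal`, and `greatest` are ours.

```python
from itertools import product

def leq(p, q):
    """Product order on the plane: p <= q iff both coordinates of p are <= those of q."""
    return p[0] <= q[0] and p[1] <= q[1]

# Integer points of the region of the example inside a small window:
# x <= 0, y <= 0, and x + y + 1 <= 0.
S = [(x, y) for x, y in product(range(-3, 1), repeat=2) if x + y + 1 <= 0]

# s is maximal if no other element of S lies above it.
maximal = [s for s in S if all(t == s or not leq(s, t) for t in S)]
# s is the greatest element if every element of S lies below it.
greatest = [s for s in S if all(leq(t, s) for t in S)]
# maximal == [(-1, 0), (0, -1)] and greatest == []
```

The two maximal points found here are the integer points of the segment $x + y + 1 = 0$, $-1 \leq x \leq 0$, and they are not comparable, which is why no greatest element exists.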


Definition. A linearly ordered set $A$ is said to be well ordered if every nonempty
subset of $A$ contains a least element. Thus if $S$ is a nonempty subset of $A$, then there
is an element $s \in S$ such that $s \leq t$ for every $t \in S$. The least element of $S$ is also called
the first element of $S$.


Example 3. N is well ordered with the usual ordering of the real numbers. We 
often use the well ordering of N without explicit mention; see, for example, the 
proof of theorem 2.1.7. ♦


Figure 2.2 
Example 4. Let $A = \mathbb{N} \cup \{\omega\}$, where $\omega$ is any object not in $\mathbb{N}$. The ordering on the
subset $\mathbb{N}$ of $A$ is the natural ordering of the integers. We define $n < \omega$ for all
$n \in \mathbb{N}$. Thus we simply define $\omega$ to be the largest element of $A$. The set $(A, \leq)$ is
well ordered. ♦


Example 5. Let $B = \{x_1, x_2, \ldots\}$ be a countable set of distinct elements such that
$B \cap \mathbb{N} = \emptyset$. Define an ordering $\leq$ on $A = \mathbb{N} \cup B$ as follows: the restriction of $\leq$
to $\mathbb{N}$ is the usual ordering on $\mathbb{N}$ and if $x_n$ and $x_m$ are in $B$, $x_n \leq x_m$ if $n \leq m$.
Finally, if $n \in \mathbb{N}$, $x \in B$, we define $n < x$. The set $(A, \leq)$ is well ordered.


We now state the well ordering principle, which is really an axiom. It simply states 
that any set can be well ordered. 


The well ordering principle: given a nonempty set A, there exists a well ordering 
on A. 


It should be clear that a countable set can be well ordered. If $\{a_1, a_2, \ldots\}$ is an
enumeration of a countable set $A$, then we can well order $A$ in a natural way: define
$a_n \leq a_m$ if $n \leq m$. Therefore the challenge is when $A$ is uncountable. Notice that an
arbitrary uncountable set contains an abundance of well-ordered subsets, namely, 
all the countable subsets of A. In order to make the terminology we use in this 
section unambiguous, when we speak of a well-ordered subset of A, we mean a 
subset that can be well ordered. 


The axiom of choice: if $\{X_\alpha\}_{\alpha \in I}$ is a nonempty collection of nonempty sets, then
$\prod_{\alpha \in I} X_\alpha$ is nonempty.


Recall that an element of $\prod_{\alpha \in I} X_\alpha$ is a function $x : I \to \bigcup_{\alpha \in I} X_\alpha$ such that $x_\alpha \in X_\alpha$
for all $\alpha \in I$. Such a function $x$ is called a choice function because it is constructed
by choosing an element $x_\alpha$ from each of the sets $X_\alpha$, hence the name “Axiom of
Choice,” which we can restate as follows: choice functions exist. Notice that the
axiom of choice is not needed when the product $\prod_\alpha X_\alpha$ involves a finite number
of factor sets. Also, when each of the factor sets $X_\alpha$ contains a distinguishable
element, one does not need the axiom of choice to assert that $\prod_\alpha X_\alpha$ is
nonempty. For example, in $\mathbb{N}$, we can pick a distinguishable element, say, 1, and
$\mathbb{N}^{\mathbb{N}}$ is not empty because it contains the constant sequence $(1, 1, 1, \ldots)$.


The axiom of choice is often applied in the following equivalent form: if X is a 
nonempty set, it is possible to choose an element from each of the nonempty 
subsets of X. 
The axiom of choice is perhaps the most believable axiom of set theory. However,
it is neither a simple fact nor obvious. In fact, the axiom of choice is equivalent to 
the well ordering principle and to the following axiom, which is less intuitive than 
the well ordering principle or the axiom of choice: 


Zorn’s lemma: if A is a partially ordered set such that every chain in A has an 
upper bound, then A contains a maximal element. 


Theorem 2.2.1. The axiom of choice, Zorn’s lemma, and the well ordering principle 
are all equivalent. 


We will include the lengthy proof of the above theorem in Appendix A. 

The rest of the results in this section are well-known theorems and lay the 
foundation for studying the cardinality of sets. The theorem below is a typical 
example of how Zorn’s lemma is applied. 


Theorem 2.2.2. Let A and B be nonempty sets. Then there is an injection from A to 
B or an injection from B to A. 


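Proof. Let $\mathfrak{F}$ be the collection of all injective functions $f$ such that $\mathrm{Dom}(f) \subseteq A$
and $R(f) \subseteq B$. Let $\{f_\alpha : \alpha \in I\}$ be an indexing of $\mathfrak{F}$ and, for $\alpha \in I$, write
$A_\alpha = \mathrm{Dom}(f_\alpha)$ and $B_\alpha = R(f_\alpha)$. Partially order $\mathfrak{F}$ as follows: $f_\alpha \leq f_\beta$ if $f_\beta$ extends $f_\alpha$.
More explicitly, $f_\alpha \leq f_\beta$ means that $A_\alpha \subseteq A_\beta$, $B_\alpha \subseteq B_\beta$, and the restriction of $f_\beta$ to
$A_\alpha$ is $f_\alpha$. Clearly, $\leq$ is a partial ordering of $\mathfrak{F}$. Now let $\mathfrak{C}$ be a chain in $\mathfrak{F}$, and index
$\mathfrak{C}$ by a subset $J$ of $I$; $\mathfrak{C} = \{f_\alpha : \alpha \in J\}$. We show that $\mathfrak{C}$ has an upper bound: let
$A_\alpha = \mathrm{Dom}(f_\alpha)$, $B_\alpha = R(f_\alpha)$, $S = \bigcup_{\alpha \in J} A_\alpha$, and $T = \bigcup_{\alpha \in J} B_\alpha$. Define $f : S \to T$ as
follows: if $x \in S$, choose a set $A_\alpha$ that contains $x$, and let $f(x) = f_\alpha(x)$. The function
$f$ is well defined because $\mathfrak{C}$ is a chain. Specifically, if $x \in A_\alpha \cap A_\beta$ ($\alpha, \beta \in J$), then
$f_\alpha \leq f_\beta$ or $f_\beta \leq f_\alpha$, say, the former. Since $f_\beta$ extends $f_\alpha$, $f_\alpha(x) = f_\beta(x)$. We leave
it to the reader to verify that $f$ is an injection. Clearly, $f$ is an upper bound of $\mathfrak{C}$.
By Zorn’s lemma, $\mathfrak{F}$ has a maximal element, say, $f_0$. Write $A_0 = \mathrm{Dom}(f_0)$ and
$B_0 = R(f_0)$. If $A_0 = A$, then $f_0$ is an injection from $A$ into $B$. If $B_0 = B$, then $f_0^{-1}$ is
an injection from $B$ to $A$, and the proof is complete. We now show that $A_0 \neq A$
and $B_0 \neq B$ cannot occur simultaneously. If that were the case, pick elements
$a \in A - A_0$, $b \in B - B_0$, and extend $f_0$ to a function $f : A_0 \cup \{a\} \to B_0 \cup \{b\}$ by
defining $f(a) = b$ and $f|_{A_0} = f_0$. Clearly, $f$ is a strict extension of $f_0$, and this
contradicts the maximality of $f_0$. ∎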


Lemma 2.2.3. Let $B$ be a nonempty subset of a set $A$. If there is an injection
$f : A \to B$, then $A \sim B$.

Proof. Assume, without loss of generality, that $R(f) \subset B \subset A$ (strict inclusions).
Define the powers of $f$ as follows: $f^{(1)}(x) = f(x)$, $f^{(n)}(x) = f(f^{(n-1)}(x))$, $n > 1$.
Let $B' = \{f^{(n)}(x) : x \in A - B, n \in \mathbb{N}\}$. Note that $B' \subseteq R(f)$ and that $f(B') \subseteq B'$.
Let $C = (A - B) \cup B'$, and let $f_1$ be the restriction of $f$ to $C$. Thus $f_1$ is an
injection from $C$ to $B'$. We now show that $f_1$ is a bijection. If $y \in B'$, then
$y = f^{(n)}(x)$ for some $x \in A - B$. If $n = 1$, then $y = f(x)$, $x \in A - B \subseteq C$. If $n > 1$,
then $y = f(z)$, $z = f^{(n-1)}(x) \in B' \subseteq C$. Now let $D = B - B'$. Since $B' \subseteq R(f)$,
$D = B - B' \supseteq B - R(f) \neq \emptyset$. The reader can check that $B'$ and $D$ partition $B$ and
that the three sets $B'$, $D$, and $A - B$ partition $A$; a simple Venn diagram makes
this abundantly clear. Thus $C$ and $D$ partition $A$, and $B'$ and $D$ partition $B$. The
function $h : A \to B$ defined below is a bijection from $A$ to $B$:


$$h(x) = \begin{cases} f_1(x) & \text{if } x \in C, \\ x & \text{if } x \in D. \end{cases}$$ ∎
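

A concrete instance may help in following the construction. Assume, purely for illustration, that $A = \mathbb{N}$, $B = \{2, 3, \ldots\}$, and $f(x) = 2x$. Then $A - B = \{1\}$, $B' = \{2, 4, 8, \ldots\}$, $C$ is the set of powers of 2 (including 1), and $D$ consists of the integers greater than 2 that are not powers of 2. The following Python sketch builds the bijection $h$ of the proof for this choice and evaluates it on an initial segment of $A$; the names `is_power_of_two` and `LIMIT` are ours.

```python
def is_power_of_two(x: int) -> bool:
    """True for 1, 2, 4, 8, ...; these integers form C = (A - B) ∪ B'."""
    return x > 0 and x & (x - 1) == 0

def h(x: int) -> int:
    """The bijection of the lemma for f(x) = 2x: apply f on C, leave D fixed."""
    return 2 * x if is_power_of_two(x) else x

LIMIT = 64
images = [h(x) for x in range(1, LIMIT + 1)]

# h is injective on {1, ..., LIMIT}, never takes the value 1, and every
# element of B up to LIMIT is attained.
injective = len(set(images)) == LIMIT
covers_B = set(range(2, LIMIT + 1)) <= set(images)
```

In this instance $h$ shifts the powers of 2 one step along the sequence 1, 2, 4, 8, ... and fixes everything else, the same device used to map $[0,1]$ onto $(0,1)$ in example 3 of section 2.1.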

Example 6. Any two open disks in the plane are equivalent. Let $0 < r_1 < r_2$,
and consider the disks $D_1 = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < r_1^2\}$ and
$D_2 = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < r_2^2\}$. Choose a number $a$ such that $0 < a < r_1$. The function
$f : D_2 \to D_1$ defined by $f(x, y) = \frac{a}{r_2}(x, y)$ is an injection, as the reader can easily verify. By
the previous lemma, $D_1 \sim D_2$. In a similar manner, one can prove that any two
open squares are equivalent. ♦


Theorem 2.2.4 (the Schröder-Bernstein theorem). Let $A$ and $B$ be nonempty sets.
If there exist injections $f : A \to B$ and $g : B \to A$, then $A \sim B$.


Proof. Let $A_1 = R(g) \subseteq A$. The function $g \circ f : A \to A_1$ is an injection. By lemma
2.2.3, there exists a bijection $h : A \to A_1$. Now $g^{-1}$ is a bijection from $A_1$ onto
$B$, and the composition $g^{-1} \circ h$ is the desired bijection from $A$ to $B$. ∎


Example 7. An open square is equivalent to an open disk. Let $S = \{(x,y) : |x| < 2, |y| < 2\}$,
let $D_3 = \{(x,y) : x^2 + y^2 < 9\}$, and let $D_1 = \{(x,y) : x^2 + y^2 < 1\}$.
Observe that $D_1 \subseteq S \subseteq D_3$. Set inclusion is clearly an injection from $S$ into
$D_3$. By example 6, there is an injection from $D_3$ into $D_1$, and hence from $D_3$ into
$S$. By the Schröder-Bernstein theorem, $S \sim D_3$. ♦


Exercises 
1. Let $A$ be a partially ordered set and let $S \subseteq A$. State the definition of each of
the following terms: the least element of $S$, a minimal element of $S$, a lower
bound of $S$, and the greatest lower bound of $S$.
2. Prove that a nonempty subset of a well-ordered set is well ordered.
3. Let $A$ be a partially ordered set such that every nonempty subset of $A$ has a
least element. Prove that $A$ is linearly ordered and hence well ordered.
4. Let $(A, \leq)$ be a linearly ordered set. Prove that $\leq$ is a well ordering if and
only if $A$ does not contain a strictly decreasing sequence $a_1 > a_2 > a_3 > \cdots$.
5. Prove that if every countable subset of a linearly ordered set $A$ is well
ordered, then $A$ is well ordered.

Definition. Let $A$ be a linearly ordered set and let $a \in A$. The initial
segment of $A$ determined by $a$ is the set $S(a) = \{x \in A : x < a\}$.

6. Let $A$ be linearly ordered and let $a, b \in A$. Show that $S(a) = S(b)$ if and only
if $a = b$.
7. Prove that if every segment of a linearly ordered set $A$ is well ordered, then
$A$ is well ordered.
8. Let $A$ be a well-ordered set, and let $B$ be a proper subset of $A$ with the
property that the conditions $b \in B$ and $c < b$ imply that $c \in B$. Prove that
$B$ is a segment of $A$.
9. Let $A$ be a well-ordered set, and let $B$ be a proper subset of $A$ such that, for
every $b \in B$ and for every $a \in A - B$, $b < a$. Prove that $B$ is a segment of $A$.
10. The principle of transfinite induction. Suppose that $A$ is a well-ordered
set and that $\emptyset \neq B \subseteq A$ is such that whenever $S(x) \subseteq B$, $x \in B$. Prove that
$B = A$.
11. Suppose that $A$ is a well-ordered set and that $B \subseteq A$. Prove that either
$\bigcup_{x \in B} S(x) = A$, or $\bigcup_{x \in B} S(x)$ is an initial segment of $A$.
12. Prove that there exists an uncountable, well-ordered set $\Omega$ such that every
initial segment of $\Omega$ is countable.
13. Let $\Omega$ be as in the previous problem. Prove that every countable subset of
$\Omega$ has an upper bound.
14. Give a direct proof of the fact that Zorn’s lemma implies the axiom of
choice. Hint: Let $\{X_\alpha\}_{\alpha \in I}$ be a nonempty collection of nonempty sets, and let
$\mathfrak{B} = \{(J, g) : J \subseteq I,\ g \in \prod_{\alpha \in J} X_\alpha\}$; that is, $g$ is a choice function on $\{X_\alpha\}_{\alpha \in J}$.
The set $\mathfrak{B} \neq \emptyset$ because finite subsets $J$ of $I$ generate such functions. Partially
order $\mathfrak{B}$ as follows: $(J_1, g_1) \leq (J_2, g_2)$ if $J_1 \subseteq J_2$ and $g_2$ extends $g_1$.
15. Let $A$ be a linearly ordered set. Is the union of a collection of well-ordered
subsets of $A$ necessarily a well-ordered set?
16. The Hausdorff maximal principle. Every partially ordered set contains a
maximal chain, that is, a chain that is not properly contained in any other
chain.

Prove that the Hausdorff maximal principle is equivalent to Zorn’s
lemma.
Hint: To prove that the Hausdorff maximal principle implies Zorn’s lemma,
let $(X, \leq)$ be a partially ordered set that satisfies the conditions of Zorn’s
lemma. Let $C$ be a maximal chain, and let $x$ be an upper bound of $C$. To
prove the converse, let $\mathfrak{C}$ be the collection of all chains in $X$, and order $\mathfrak{C}$ by
set inclusion. Verify that the conditions of Zorn’s lemma are met and hence
$\mathfrak{C}$ contains a maximal member, that is, a maximal chain in $X$.

17. Prove that any open disk is equivalent to any closed disk. 

18. Prove that any open square is equivalent to any closed square. 


2.3 Cardinal Numbers 


In section 2.1 we took a small step toward showing that infinite sets are not created 
equal. In this section, we show that there are infinitely many types of infinities, 
in the sense that there is a whole cascade (loosely speaking) of infinite sets of 
unequal sizes, or cardinalities. This is the first result in the section. Our approach 
to infinite cardinals is intuitive rather than axiomatic. We proceed to show that the 
set of integers is the smallest infinite set, then we prove that a set of infinite sets is 
well ordered by size, or cardinality. Only an intuitive understanding of cardinal 
numbers is essential for subsequent material that makes reference to cardinality.
Thus the discussion of cardinal arithmetic and sums of infinitely many cardinals 
can be omitted on the first reading if the goal is to take the fastest route to chapter 4. 


Definition. Let A and B be nonempty sets. We say that A and B have the same 
cardinality ifA ~ B. We also say that A and B define the same cardinal number, 
and we write Card(A) = Card(B). 


Definition. Let $a = \mathrm{Card}(A)$ and $b = \mathrm{Card}(B)$. We say that $a \leq b$ if there is an
injection from $A$ to $B$. We also write $a < b$ to mean that there is an injection from
$A$ to $B$ but that $A$ and $B$ are not equivalent. By the Schröder-Bernstein theorem,
this is equivalent to saying that there is no injection from $B$ to $A$.


Theorem 2.3.1. For any set $A$, $\mathrm{Card}(A) < \mathrm{Card}(\mathcal{P}(A))$.


Proof. Recall that the notation $\mathcal{P}(A)$ stands for the power set of $A$. Define a function
$f : A \to \mathcal{P}(A)$ by $f(x) = \{x\}$. Clearly, $f$ is one-to-one; therefore,
$\mathrm{Card}(A) \leq \mathrm{Card}(\mathcal{P}(A))$. If $\mathrm{Card}(A) = \mathrm{Card}(\mathcal{P}(A))$, then there exists a bijection
$g : A \to \mathcal{P}(A)$. Define $S = \{x \in A : x \notin g(x)\}$. Since $g$ is onto, let $a$ be such that $g(a) = S$. If
$a \in S$, then, by the definition of $S$, $a \notin S$. If $a \notin S$, then again, by the definition of
$S$, $a \in S$. This contradiction completes the proof.
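

For a small finite set, the argument can be verified exhaustively. The following Python sketch (the set $A = \{0, 1, 2\}$ and the names `subsets` and `missed_set` are assumptions made for the illustration) runs through every function $g : A \to \mathcal{P}(A)$ and confirms that the set $S$ of the proof is never a value of $g$.

```python
from itertools import combinations, product

A = [0, 1, 2]
# All 2**3 = 8 subsets of A, stored as frozensets so they can be compared and hashed.
subsets = [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

def missed_set(g: dict) -> frozenset:
    """The set S = {x in A : x not in g(x)} from the proof."""
    return frozenset(x for x in A if x not in g[x])

# Check all 8**3 = 512 functions g : A -> P(A).
counterexamples = 0
for values in product(subsets, repeat=len(A)):
    g = dict(zip(A, values))
    if missed_set(g) in g.values():
        counterexamples += 1
# counterexamples == 0: no g attains its own set S, so no g is onto.
```

The exhaustive search adds nothing to the proof, which works for every set $A$, but it shows concretely how $S$ is tailored to each $g$.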


The reader may have observed that while our definition of what it means for 
two sets to have the same cardinality is unambiguous, we have not really defined 
what a cardinal number is. We can give a slightly more tangible definition of a
set of cardinal numbers as follows: Let $\mathfrak{S}$ be a set of sets. By theorem 2.1.1, set
equivalence is an equivalence relation on $\mathfrak{S}$. We can define the cardinal numbers
in $\mathfrak{S}$ to be equivalence classes of set equivalence in $\mathfrak{S}$. This does not define all
cardinal numbers because if $A$ is a set that is not equivalent to any set in $\mathfrak{S}$,² then
$\mathrm{Card}(A) \neq \mathrm{Card}(S)$ for all $S \in \mathfrak{S}$.


One might be tempted to generalize the idea of the last paragraph by considering 
the set of all sets, instead of a fixed set of sets. However, within the limitations 
of naive set theory, this is paradoxical for the following reason: If we were 
allowed to use terms such as the set of all sets, $\mathfrak{S}$, let $U = \bigcup\{S : S \in \mathfrak{S}\}$. Since
$U$ contains every $S \in \mathfrak{S}$, $\mathrm{Card}(S) \leq \mathrm{Card}(U)$. Since $\mathfrak{S}$ contains all sets, $\mathrm{Card}(U)$
would be the largest cardinal number. This is a paradox because, by theorem 2.3.1,
$\mathrm{Card}(\mathcal{P}(U)) > \mathrm{Card}(U)$. Such paradoxes can be avoided in an axiomatic treatment
of set theory. Such a treatment is hardly essential for our purposes because we will 
never refer to cardinal numbers as an absolute concept. We will be content to 
think of cardinal numbers as a comparative measure of the size of sets in the sense 
of the opening definition of this section. 


Some common cardinals are: 


$n = \mathrm{Card}(\mathbb{N}_n)$
$\aleph_0 = \mathrm{Card}(\mathbb{N})$
$\mathfrak{c} = \mathrm{Card}(\mathbb{R})$


The natural numbers are the finite cardinals, and all other cardinals are infinite.


Theorem 2.3.2. $\aleph_0$ is the smallest infinite cardinal number.


Proof. Let $A$ be an infinite set. By theorem 2.1.4, $A$ contains a countable set
of distinct elements, $B = \{b_1, b_2, \ldots\}$. Since the inclusion $B \subseteq A$ is an injection,
$\aleph_0 = \mathrm{Card}(B) \leq \mathrm{Card}(A)$. ∎


Theorem 2.3.3. Let $\mathfrak{S}$ be a set of cardinal numbers. Then $\mathfrak{S}$ is linearly ordered.


Proof. Let $a, b \in \mathfrak{S}$ and let $A$ and $B$ be sets such that $a = \mathrm{Card}(A)$ and $b = \mathrm{Card}(B)$.
By theorem 2.2.2, there is an injection from $A$ to $B$ or one from $B$ to $A$. Thus
$a \leq b$ or $b \leq a$. To check antisymmetry, suppose that $a \leq b \leq a$. Then there is an
injection from $A$ to $B$ and one from $B$ to $A$. By the Schröder-Bernstein theorem,
$A \sim B$ and $a = b$. ∎


² Such a set $A$ exists. One can take $A$ to be the power set of $\bigcup\{S : S \in \mathfrak{S}\}$.




The following theorem establishes the fact that any set of cardinal numbers is well 
ordered. 


Theorem 2.3.4. If $\mathfrak{S} = \{\xi_\alpha\}_{\alpha \in I}$ is a set of cardinal numbers, then there is an element
$\alpha_0 \in I$ such that $\xi_{\alpha_0} \leq \xi_\alpha$ for all $\alpha \in I$.


Proof. Let $\{X_\alpha\}_{\alpha \in I}$ be sets such that $\xi_\alpha = \mathrm{Card}(X_\alpha)$. If $\mathfrak{S}$ contains any integers, the
smallest of these integers is the least cardinal in $\mathfrak{S}$. Otherwise, all the sets $X_\alpha$
are infinite. We prove that there is $\alpha_0 \in I$ such that, for every $\alpha \in I$, there is an
injection $f_\alpha : X_{\alpha_0} \to X_\alpha$.

Let $X = \prod_{\alpha \in I} X_\alpha$, and let $\mathfrak{B}$ be the collection of subsets $B$ of $X$ with the property
that if $x = (x_\alpha)$ and $y = (y_\alpha)$ are distinct elements of $B$, then $x_\alpha \neq y_\alpha$ for all $\alpha \in I$.
Order $\mathfrak{B}$ by set inclusion. It is clear that if $\mathfrak{C}$ is a chain in $\mathfrak{B}$, then $\mathfrak{C}$ has an upper
bound, namely, $\bigcup\{C : C \in \mathfrak{C}\}$. By Zorn’s lemma, $\mathfrak{B}$ has a maximal member, $B$.
We claim that, for some $\alpha_0 \in I$, $\pi_{\alpha_0}(B) = X_{\alpha_0}$. If this is not the case, then, for
each $\alpha \in I$, choose an element $a_\alpha \in X_\alpha - \pi_\alpha(B)$ and let $a = (a_\alpha)$. The set $B \cup \{a\}$
is clearly in $\mathfrak{B}$, which contradicts the maximality of $B$ and shows that, for some
$\alpha_0 \in I$, $\pi_{\alpha_0}(B) = X_{\alpha_0}$.

Now, for each $a \in X_{\alpha_0}$, there is a unique element $x \in B$ such that $\pi_{\alpha_0}(x) = a$.
Such $x$ exists because $\pi_{\alpha_0}$ maps $B$ onto $X_{\alpha_0}$, and it is unique by the definition of $\mathfrak{B}$ and the
fact that $B \in \mathfrak{B}$. Now define $f_\alpha : X_{\alpha_0} \to X_\alpha$ as follows: $f_\alpha(a) = \pi_\alpha(x)$, where $x$ is
the element of $B$ constructed above.³ By construction, $f_\alpha$ is an injection. ∎


Cardinal Arithmetic 


Definition. Let $A$ and $B$ be disjoint sets, and write $a = \operatorname{Card}(A)$, $b = \operatorname{Card}(B)$.
By definition:

the sum of $a$ and $b$: $a + b = \operatorname{Card}(A \cup B)$

the product of $a$ and $b$: $ab = \operatorname{Card}(A \times B)$

exponentiation: $a^b = \operatorname{Card}(A^B)$; recall that $A^B$ is the set of all functions $B \to A$

The above operations are well defined in the sense that they are independent of
the particular sets $A$ and $B$ chosen to represent $a$ and $b$. For example, if $A \approx C$ and
$B \approx D$, then $A \times B \approx C \times D$ and $A^B \approx C^D$. See the exercises on section 2.1.


Example 1. For any cardinal number $a$, $a < 2^a$.

Let $A$ be a set such that $\operatorname{Card}(A) = a$. By theorem 2.3.1 and problem 16 on section 1.1,
$a < \operatorname{Card}(\mathcal{P}(A)) = \operatorname{Card}(2^A) = 2^a$. ◆

³ In fact, $f_\alpha(a) = \pi_\alpha\big((\pi_{\alpha_0}|_B)^{-1}(a)\big)$.


Example 2. Let $a$ and $b$ be cardinal numbers. Then $a \le b$ if and only if there is a
cardinal number $c$ such that $a + c = b$.

Suppose $b = a + c$ and let $A$, $B$, and $C$ be sets such that $\operatorname{Card}(A) = a$, $\operatorname{Card}(B) = b$,
and $\operatorname{Card}(C) = c$, and suppose that $A \cap C = \emptyset$. By assumption, there is a bijection
$f : A \cup C \to B$. The restriction of $f$ to $A$ injects $A$ into $B$. Hence $a \le b$. Conversely,
if $a \le b$, let $A$ and $B$ be as above, let $f : A \to B$ be an injection and assume that
$A \cap B = \emptyset$. Define $C = B - f(A)$. The function $g : A \cup C \to B$ defined below is
easily seen to be a bijection:

$$g(x) = \begin{cases} f(x) & \text{if } x \in A, \\ x & \text{if } x \in C. \end{cases}$$

If $c = \operatorname{Card}(C)$ then, by definition, $a + c = b$. ◆

Theorem 2.3.5. Let $a$, $b$, and $c$ be cardinal numbers. Then

1. $a + b = b + a$ and $ab = ba$.
2. $a + (b + c) = (a + b) + c$ and $a(bc) = (ab)c$.
3. $a(b + c) = ab + ac$.
4. $a^b a^c = a^{b+c}$.
5. $a^c b^c = (ab)^c$.
6. $(a^b)^c = a^{bc}$.
7. If $a \le b$, then $a + c \le b + c$.
8. If $a \le b$, and $c \ge 1$, then $ac \le bc$.


Proof. Most of the rules of cardinal arithmetic are obvious. We prove property 6,
as an example. Let $A$, $B$, and $C$ be such that $a = \operatorname{Card}(A)$, $b = \operatorname{Card}(B)$ and
$c = \operatorname{Card}(C)$. We need to show that $(A^B)^C$ is equivalent to $A^{B \times C}$. Let $f \in (A^B)^C$.
Then for $c \in C$, $f(c)$ is a function from $B$ to $A$. We write $f_c$ instead of $f(c)$. For such
an $f$, define a function $\phi(f) = g : B \times C \to A$ by $g(b, c) = f_c(b)$. The assignment
$\phi : f \mapsto g$ maps $(A^B)^C$ to $A^{B \times C}$.

$\phi$ is onto: If $g : B \times C \to A$, define $f : C \to A^B$ by $f_c(b) = g(b, c)$. Clearly $g = \phi(f)$.

$\phi$ is one-to-one: Let $f$ and $f'$ in $(A^B)^C$ be such that $f \ne f'$. Then there is
$c \in C$ such that $f_c \ne f'_c$. Thus there is $b \in B$ such that $f_c(b) \ne f'_c(b)$. Now if
$g = \phi(f)$ and $g' = \phi(f')$, then $g(b, c) = f_c(b) \ne f'_c(b) = g'(b, c)$. ∎

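For finite sets, the map $\phi$ in the proof of property 6 can be checked by brute force. The sketch below is only an illustration under our own conventions: functions are stored as Python dictionaries, and the names `all_functions`, `phi`, and `check_exponent_law` are hypothetical helpers, not notation from the text. It enumerates every $f \in (A^B)^C$, applies $\phi$, and confirms that the images are pairwise distinct and exhaust $A^{B \times C}$.

```python
from itertools import product

def all_functions(domain, codomain):
    """Yield every function domain -> codomain, each stored as a dict."""
    domain = list(domain)
    for values in product(list(codomain), repeat=len(domain)):
        yield dict(zip(domain, values))

def phi(f, B, C):
    """Send f : C -> A^B to g : B x C -> A, where g(b, c) = f_c(b)."""
    return {(b, c): f[c][b] for b in B for c in C}

def check_exponent_law(A, B, C):
    """Return True if phi is a bijection from (A^B)^C onto A^(B x C)."""
    functions_B_to_A = list(all_functions(B, A))      # the set A^B
    images = set()
    count = 0
    for f in all_functions(C, functions_B_to_A):      # f ranges over (A^B)^C
        images.add(frozenset(phi(f, B, C).items()))   # hashable copy of g
        count += 1
    target_size = len(A) ** (len(B) * len(C))         # Card(A^(B x C))
    # Injective: no two f share an image. Onto: every g : B x C -> A is hit.
    return count == len(images) == target_size

print(check_exponent_law([0, 1], ["p", "q"], ["x", "y", "z"]))
```

A finite check of this kind proves nothing about infinite cardinals, of course; it only makes the bookkeeping in the definition of $\phi$ concrete.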
Example 3. Let $a$, $b$, $c$, and $d$ be cardinal numbers such that $a \le b$ and $c \le d$. Then
$ac \le bd$.

By example 2, there are cardinal numbers $r$ and $s$ such that $a + r = b$ and
$c + s = d$. Now $bd = (a + r)(c + s) = ac + (as + rc + rs)$. Again by example 2,
$ac \le bd$. ◆


Example 4. If $a$, $b$, and $c$ are cardinal numbers and $a \le b$, then $a^c \le b^c$.

Let $A$, $B$, and $C$ be such that $\operatorname{Card}(A) = a$, $\operatorname{Card}(B) = b$, and $\operatorname{Card}(C) = c$. By
assumption, there is an injection $g : A \to B$. Define a function $\phi : A^C \to B^C$ by
$\phi(f) = g \circ f$. The function $\phi$ is an injection; hence $a^c \le b^c$. ◆


Theorem 2.3.6. Let a be an infinite cardinal. Then a.a = a. 


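Proof. Let $A$ be such that $\operatorname{Card}(A) = a$, and let $\mathfrak{B} = \{(A_\alpha, f_\alpha)\}_{\alpha \in I}$ be the collection
of all bijections $f_\alpha : A_\alpha \to A_\alpha \times A_\alpha$, where $A_\alpha \subseteq A$. To see that $\mathfrak{B} \ne \emptyset$, pick a
countable subset $G$ of $A$. By theorem 2.1.6, $G \approx G \times G$; hence $\mathfrak{B} \ne \emptyset$.

Order $\mathfrak{B}$ as follows: for $\alpha$ and $\beta \in I$, $(A_\alpha, f_\alpha) \le (A_\beta, f_\beta)$ if $A_\alpha \subseteq A_\beta$, and $f_\beta$
extends $f_\alpha$. If $\mathfrak{C} = \{(A_\alpha, f_\alpha)\}_{\alpha \in J}$ is a chain in $\mathfrak{B}$, let $C = \bigcup_{\alpha \in J} A_\alpha$ and define a
function $f : C \to C \times C$ by $f(x) = f_\alpha(x)$, where $\alpha \in J$ is such that $x \in A_\alpha$. The
function $f$ is a well-defined bijection from $C$ to $C \times C$, as the reader can verify.
Clearly, $(C, f)$ extends every member in $\mathfrak{C}$ and hence is an upper bound of
$\mathfrak{C}$. By Zorn's lemma, $\mathfrak{B}$ contains a maximal member, $(C, g)$. We claim that
$\operatorname{Card}(C) = a$. Suppose for a contradiction that $\operatorname{Card}(C) = b < a$. First observe
that $b \le b + b \le b.b = \operatorname{Card}(C \times C) = \operatorname{Card}(C) = b$, and hence $b + b = b.b = b$.
Now let $d = \operatorname{Card}(A - C)$. If $d \le b$, then $a = b + d \le b + b = b$, which contradicts
the supposition that $b < a$. Therefore $b < d$, and $A - C$ contains a subset $E$ such
that $\operatorname{Card}(E) = b$.

Now $(C \cup E) \times (C \cup E) = (C \times C) \cup K$, where $K = (C \times E) \cup (E \times C) \cup (E \times E)$.
Since $K$ is the disjoint union of three sets each of cardinality $b.b = b$, $\operatorname{Card}(K) = b +
b + b = (b + b) + b = b + b = b$. Therefore there is a bijection $h : E \to K$. Now
define a function $f : C \cup E \to (C \times C) \cup K = (C \cup E) \times (C \cup E)$ by

$$f(x) = \begin{cases} g(x) & \text{if } x \in C, \\ h(x) & \text{if } x \in E. \end{cases}$$

Clearly, the pair $(C \cup E, f) \in \mathfrak{B}$ is a strict extension of $(C, g)$, which contradicts the
maximality of $(C, g)$. This shows that the supposition $b < a$ is false; hence $a = b$.
This concludes the proof because $a.a = \operatorname{Card}(C \times C) = \operatorname{Card}(C) = a$. ∎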
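The proof starts from the countable case $G \approx G \times G$. One standard explicit bijection for that case is the Cantor pairing function, sketched below. We do not claim this is the bijection used in the proof of theorem 2.1.6; the function names `pair` and `unpair` are ours, and we assume the natural numbers start at 0 for convenience.

```python
from math import isqrt

def pair(m, n):
    """Cantor pairing: a bijection from N x N onto N, with N = {0, 1, 2, ...}."""
    return (m + n) * (m + n + 1) // 2 + n

def unpair(k):
    """Inverse of pair: recover (m, n) from k."""
    w = (isqrt(8 * k + 1) - 1) // 2      # largest w with w(w+1)/2 <= k
    n = k - w * (w + 1) // 2             # position along the diagonal m + n = w
    return w - n, n

# Round trip on a grid, and surjectivity onto an initial segment of N.
grid_ok = all(unpair(pair(m, n)) == (m, n) for m in range(50) for n in range(50))
values = sorted(pair(m, n) for m in range(20) for n in range(20) if m + n < 20)
print(grid_ok, values == list(range(210)))
```

The pairs with $m + n < 20$ fill the first 20 diagonals, which contain $1 + 2 + \cdots + 20 = 210$ points, so their images should be exactly $0, 1, \ldots, 209$.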


Corollary 2.3.7. If $a$ is an infinite cardinal number and $1 \le b \le a$, then $ab = a$.

Proof. $a \le ab \le aa = a$. ∎

Theorem 2.3.8. Let $a$ be an infinite cardinal. Then $a + a = a$.

Proof. $a \le a + a \le a.a = a$. Therefore $a + a = a$. ∎

Corollary 2.3.9. If $a$ is an infinite cardinal and $1 \le b \le a$, then $a + b = a$.

Proof. $a \le a + b \le a + a = a$. ∎

Example 5. Let $b$ be an infinite cardinal number, and suppose that $1 < a \le b$.
Then $a^b = 2^b$.

Since $a < 2^a$, $a^b \le (2^a)^b = 2^{ab} = 2^b$. Because $2 \le a$, $2^b \le a^b$, and $2^b = a^b$. ◆


We conclude our study of cardinal arithmetic with a brief exploration of sums of 
infinitely many cardinals. 


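Definition. Let $\{a_\alpha\}_{\alpha \in I}$ be a set of cardinal numbers, and let $\{A_\alpha\}_{\alpha \in I}$ be a collection
of disjoint sets such that $\operatorname{Card}(A_\alpha) = a_\alpha$. Define $\sum_{\alpha \in I} a_\alpha = \operatorname{Card}\big(\bigcup_{\alpha \in I} A_\alpha\big)$.

Theorem 2.3.10. Let $\{a_\alpha\}_{\alpha \in I}$ be a collection of equal cardinal numbers, say, $a_\alpha = a$,
and let $b = \operatorname{Card}(I)$. Then $\sum_{\alpha \in I} a_\alpha = ab$.

Proof. Let $A$ be such that $\operatorname{Card}(A) = a$, and let $\{A_\alpha\}_{\alpha \in I}$ be a collection of disjoint
sets such that $\operatorname{Card}(A_\alpha) = a$. Then there are bijections $f_\alpha : A \to A_\alpha$. Define a
function $f : A \times I \to \bigcup_{\alpha \in I} A_\alpha$ by $f(x, \alpha) = f_\alpha(x)$. Verifying that $f$ is a bijection is
straightforward. Therefore $\sum_{\alpha \in I} a_\alpha = \operatorname{Card}\big(\bigcup_{\alpha \in I} A_\alpha\big) = \operatorname{Card}(A \times I) = ab$. ∎

Theorem 2.3.11. If $a_\alpha \le b_\alpha$ for every $\alpha \in I$, then $\sum_{\alpha \in I} a_\alpha \le \sum_{\alpha \in I} b_\alpha$.

Proof. Let $\{A_\alpha\}$ be a collection of disjoint sets such that $\operatorname{Card}(A_\alpha) = a_\alpha$, and let
$\{B_\alpha\}$ be a collection of disjoint sets such that $\operatorname{Card}(B_\alpha) = b_\alpha$. By assumption,
there exist injections $f_\alpha : A_\alpha \to B_\alpha$. Define a function $f : \bigcup_{\alpha \in I} A_\alpha \to \bigcup_{\alpha \in I} B_\alpha$ by
$f(x) = f_\alpha(x)$ if $x \in A_\alpha$. The function $f$ is well defined because $\{A_\alpha\}$ is a disjoint
family. Clearly, $f$ is an injection from $\bigcup_{\alpha \in I} A_\alpha$ into $\bigcup_{\alpha \in I} B_\alpha$, because $\{B_\alpha\}$ is also a disjoint family. ∎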


The following theorem is a far-reaching generalization of theorem 2.1.12: 


Theorem 2.3.12. Let $I$ be an infinite set, and let $b = \operatorname{Card}(I)$. Then the family $\mathfrak{F}$ of
finite sequences in $I$ has cardinality $b$.

Proof. Let $I^n$ be the family of sequences in $I$ of length exactly $n$. Since $I^n =
I^{\mathbb{N}_n}$, $\operatorname{Card}(I^n) = b^n = b$. Now $\mathfrak{F} = \bigcup_{n=1}^{\infty} I^n$, so $\operatorname{Card}(\mathfrak{F}) = \sum_{n=1}^{\infty} \operatorname{Card}(I^n)$.
By theorem 2.3.10, $\sum_{n=1}^{\infty} \operatorname{Card}(I^n) = \aleph_0 b = b$. ∎


We conclude the section with a well-known theorem and a famous conjecture. 
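Theorem 2.3.13. $2^{\aleph_0} = \mathfrak{c}$.

Proof. By definition, $2^{\aleph_0} = \operatorname{Card}(2^{\mathbb{N}})$. Let $T$ be the set of all binary sequences that
contain only a finite number of nonzero terms. By problem 6 on section 2.1,
$\operatorname{Card}(2^{\mathbb{N}} - T) = \operatorname{Card}(2^{\mathbb{N}}) = 2^{\aleph_0}$. By the proof of theorem 2.1.15 (see also problem
11 on section 2.1), $2^{\mathbb{N}} - T \approx (0, 1] \approx \mathbb{R}$. Thus $\mathfrak{c} = \operatorname{Card}(\mathbb{R}) = \operatorname{Card}((0, 1]) =
\operatorname{Card}(2^{\mathbb{N}} - T) = 2^{\aleph_0}$. ∎

Example 6. $\mathfrak{c}^{\aleph_0} = \mathfrak{c}$.

Using theorems 2.3.6 and 2.3.13, $\mathfrak{c}^{\aleph_0} = (2^{\aleph_0})^{\aleph_0} = 2^{\aleph_0 \aleph_0} = 2^{\aleph_0} = \mathfrak{c}$. ◆

Example 7. There exists a sequence of cardinal numbers $a_1, a_2, \ldots$ such that
$a_1 < a_2 < \ldots$ and $a_n^{\aleph_0} = a_n$.

Take $a_1 = \mathfrak{c}$. By the previous example, $a_1^{\aleph_0} = a_1$. For $n \ge 1$, define $a_{n+1} = 2^{a_n}$.
Then $a_{n+1}^{\aleph_0} = 2^{a_n \aleph_0} = 2^{a_n} = a_{n+1}$ by corollary 2.3.7. ◆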


The Continuum Hypothesis 


Take a sufficiently large infinite cardinal such as $a = 2^{2^{2^{\aleph_0}}}$. There are several infinite
cardinals between $\aleph_0$ and $a$, such as $2^{\aleph_0}$ and $2^{2^{\aleph_0}}$. Consider the set of cardinals
strictly between $\aleph_0$ and $a$. By theorem 2.3.4, there is a smallest cardinal in that set.
We call it $\aleph_1$. Thus $\aleph_1$ is the immediate successor of $\aleph_0$ in the sense that there are
no cardinals strictly between $\aleph_0$ and $\aleph_1$.

We know that $2^{\aleph_0} > \aleph_0$. By the above paragraph, $\aleph_1 \le 2^{\aleph_0}$. A famous conjecture
of set theory is the continuum hypothesis, which states that $2^{\aleph_0} = \aleph_1$. In other
words, there are no cardinals strictly between $\aleph_0$ and $2^{\aleph_0} = \mathfrak{c}$.

The generalized continuum hypothesis states that, for any infinite cardinal $a$,
there is no cardinal number $b$ such that $a < b < 2^a$, that is, $2^a$ is the immediate
successor of $a$.


Exercises 


1. Provide proofs of the statements of theorem 2.3.5. 
2. Prove that if $a \ge 2$ is a cardinal number, then $a + a \le a.a$. This result was used
in the proof of theorems 2.3.6 and 2.3.8.

3. Let $a$ and $b$ be infinite cardinal numbers. Prove that if $a + a = a + b$, then
$a \ge b$.

4. Let $a$, $b$, and $c$ be infinite cardinal numbers. Prove that if $a + b < a + c$, then
$b < c$.


. What is Nh” 2 
. Prove that SS N=. 
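7. Let $A$ and $B$ be infinite sets and let $f : A \to B$ be a surjection.

(a) Prove that $\operatorname{Card}(A) \ge \operatorname{Card}(B)$.
(b) Prove that if $f^{-1}(b)$ is at most countable for each $b \in B$, then $A \approx B$.

8. Let $\{A_\alpha\}_{\alpha \in I}$ be a family of nonempty sets. Prove that $\operatorname{Card}\big(\bigcup_{\alpha \in I} A_\alpha\big) \le
\sum_{\alpha \in I} \operatorname{Card}(A_\alpha)$.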


3 


Vector Spaces 


Questions that pertain to the foundations of mathematics, although treated by 
many in recent times, still lack a satisfactory solution. Ambiguity of language is 
philosophy’s main source of problems. That is why it is of the utmost importance 
to examine attentively the very words we use. 

Giuseppe Peano 


Giuseppe Peano. 1858-1932 


Peano was born in a farmhouse about 5 km from Cuneo, where he received his 
early education. One of Peano’s uncles was a priest and a lawyer in Turin, and 
he realized the child’s talent. He took him to Turin in 1870 for his secondary 
schooling. Peano entered the University of Turin in 1876, graduated in 1880 as a
doctor of mathematics, and was appointed to the university the same year. He
received his qualification to be a university professor in 1884. 


In 1886 Peano proved the existence of the solution of the differential equation 
dy/dx = f(x,y) under the mere assumption that f is continuous in the neigh- 
borhood of the initial point (xo, yp). In 1888 he published the book Geometrical 
Calculus, which begins with a chapter on mathematical logic. A significant feature 
of the book is that, in it, Peano sets out with great clarity the ideas of Grassmann, 
who made the first attempt to define a vector space, albeit in a rather obscure way. 


Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules. 
DOI: 10.1093/oso/9780198868781.003.0003
This book contains the first definition of a vector space given with a remarkably
modern notation and style. This was, without a doubt, a big development in the 
history of mathematics. In 1889 Peano published an axiomatic approach to the 
definition of the natural numbers that was based on the notion of the successor 
function. In 1890 he made the stunning discovery that there are continuous 
surjective mappings from [0,1] onto the unit square, which came to be known as 
space-filling curves. 


Peano’s career was strangely divided into two periods. The period up to 1900 
is one where he showed great originality and a remarkable feel for topics that 
would be important in the development of mathematics. His achievements were 
outstanding, and he had a modern style quite ahead of his own time. However, 
this feel for what was important seemed to leave him, and, after 1900, he worked 
with great enthusiasm on two projects of great difficulty, which were enormous 
undertakings but proved quite unimportant in the development of mathematics. 

From around 1892, Peano embarked on a new and extremely ambitious project,
namely, the Formulario Mathematico. As he explained:¹


of the greatest usefulness would be the publication of collections of all
the theorems now known that refer to given branches of the mathematical
sciences ... Such a collection, which would be long and difficult in ordinary
language, is made noticeably easier by using the notation of mathematical
logic


Even before the Formulario Mathematico project was completed, Peano took up the 
project of finding an international, artificial language, “Latino sine flexione,” which 
was based on Latin but stripped of all grammar. He compiled the vocabulary by 
taking words from English, French, German, and Latin. In fact, the final edition of 
the Formulario Mathematico was written in Latino sine flexione, which is another 
reason the work was so little used. 


The Evolution of the Concept of a Vector Space²

The emergence of the modern definition of a vector space was delayed for a
considerable length of time for several reasons. It appears that early attempts
to define what we know now as a vector space were hindered by the insistence
on incorporating axioms for determinants. The lack of awareness of the
importance of axiomatics and abstract thinking was also a major obstacle. Grassmann’s
pioneering ideas were obscured by philosophical language, and although Peano’s
definition was a long step toward axiomatization, it did not produce the modern
definition. The founders of functional analysis were instrumental in framing the
modern definition. In 1916 Riesz studied spaces of continuous functions and
defined linear transformations and even the concept of bounded linear operators.
Decisive steps toward axiomatization were taken independently by Banach in 1920
and by Hahn, in two papers published in 1922 and 1927. In 1920 Banach took
Riesz’s ideas one step further and defined what is known in modern terminology as
a Banach space. The function spaces Banach and Riesz studied are infinite
dimensional, and this makes the use of an axiomatic approach compulsory. Banach’s
approach was confined to function spaces, and his axioms did not coincide with
the modern definition in that some axioms are redundant and some are missing.
Modern algebra finally paved the way toward the modern definition of a vector
space: determinants were dropped from the axiomatic approach, and this unified
the definition of finite and infinite-dimensional spaces. The definition was made
accessible to beginning students in books that were published in the 1940s by
Birkhoff and MacLane, Halmos, and Bourbaki.


¹ J. J. O’Connor and E. F. Robertson, “Giuseppe Peano,” in MacTutor History of Mathematics (St
Andrews: University of St Andrews, 1998), http://mathshistory.st-andrews.ac.uk/Biographies/Peano/,
accessed Nov. 3, 2020.

² All the historical information in this article can be found in Jean-Luc Dorier, “A general outline of
the genesis of vector space theory,” Historia Mathematica 22, no. 3 (1995): 227-61.


3.1 Definitions and Basic Properties 


This section is a summary of the most basic concepts of vector space theory. The 
main reason for including this section is to establish terminology and provide a 
collection of important examples. The reader should pay particular attention to 
the examples, because the sequence and function spaces we introduce here are of 
fundamental importance for the rest of the book. The theorems are stated without 
proof. 


Definition. Let $\mathbb{K}$ be a field, and suppose $U$ is a nonempty set equipped with a
binary operation, $+$ (vector addition). Suppose also that there is a function
$\times : \mathbb{K} \times U \to U$ (scalar multiplication) that assigns to each pair $(a, u) \in \mathbb{K} \times U$
an element $a \times u$ (or simply $au$) in $U$. The triple $(U, +, \times)$ is called a vector space
over the field $\mathbb{K}$ if the following conditions are satisfied by all elements $a, b \in \mathbb{K}$
and all elements $u, v, w \in U$:

(a) $u + v = v + u$.
(b) $u + (v + w) = (u + v) + w$.
(c) There is an element $0 \in U$ (the zero vector) such that $u + 0 = u$.
(d) For every $u \in U$, there is an element $-u \in U$ such that $u + (-u) = 0$.
(e) $a(u + v) = au + av$.
(f) $(a + b)u = au + bu$.
(g) $(ab)u = a(bu)$.
(h) $1u = u$.

The field $\mathbb{K}$ is called the base field, the elements of $\mathbb{K}$ are referred to as scalars, and
the elements in $U$ are called vectors. The only two fields we will use in this book are
the real field, $\mathbb{R}$, and the complex field, $\mathbb{C}$. Either of these two fields will be denoted
by $\mathbb{K}$. Most of the results we will obtain apply equally whether the underlying field
is $\mathbb{R}$ or $\mathbb{C}$. When a given result applies to only one field but not the other, we will
explicitly state the base field.


Example 1. For each $n \in \mathbb{N}$, let $\mathbb{K}^n$ be the set of sequences in $\mathbb{K}$ of length $n$. The
set $\mathbb{K}^n$ is a vector space with the operations

$(x_1, \ldots, x_n) + (y_1, \ldots, y_n) = (x_1 + y_1, \ldots, x_n + y_n)$, $\quad a(x_1, \ldots, x_n) = (ax_1, \ldots, ax_n)$. ◆


Example 2. Let $I$ be a nonempty set, and let $\mathbb{K}^I$ be the space of all functions from
$I$ to $\mathbb{K}$. For functions $x = (x_\alpha)_{\alpha \in I}$, $y = (y_\alpha)_{\alpha \in I}$ and for $a \in \mathbb{K}$, define

$x + y = (x_\alpha + y_\alpha)_{\alpha \in I}$, $\quad ax = (ax_\alpha)_{\alpha \in I}$. ◆


Example 3. The space $\mathbb{K}(I)$ is the space of all functions $x : I \to \mathbb{K}$ such that $x_\alpha = 0$
for all but a finite number of elements $\alpha \in I$. Addition and scalar multiplication
are defined as in example 2. ◆


Example 4. Important special cases of examples 2 and 3 are obtained when $I = \mathbb{N}$;
$\mathbb{K}^{\mathbb{N}}$ is the space of all sequences in $\mathbb{K}$, and $\mathbb{K}(\mathbb{N})$ is the space of sequences that
have a finite number of terms different from 0. ◆


Example 5. Let $\mathbb{P}$ be the set of all polynomials with coefficients in $\mathbb{K}$. We
add polynomials by adding the coefficients of equal powers of $x$, and scalar
multiplication is defined by $a\big(\sum_{i=0}^{n} a_i x^i\big) = \sum_{i=0}^{n} (a a_i) x^i$. Let $\mathbb{P}_n$ be the space of
polynomials of degree $\le n$. Clearly, $\mathbb{P}_n$ is contained in $\mathbb{P}$ for all $n \in \mathbb{N}$. In fact,
$\mathbb{P} = \bigcup_{n \in \mathbb{N}} \mathbb{P}_n$. ◆


Example 6. Let $\mathbb{K}_{m \times n}$ be the space of all $m \times n$ matrices. Addition and scalar
multiplication are defined entrywise, in the usual manner. ◆


Example 7. For real numbers $a < b$, define $X = B[a, b]$ as the space of all bounded
(real or complex) functions on the interval $[a, b]$. For $f, g \in X$, $x \in [a, b]$, and
$c \in \mathbb{K}$, define vector addition and scalar multiplication in $X$, respectively, by

$(f + g)(x) = f(x) + g(x)$,
$(cf)(x) = c f(x)$. ◆


Example 8. Another important example is the set $C[a, b]$ of continuous functions
on the closed bounded interval $[a, b]$. Vector addition and scalar multiplication
are defined as in the previous example. Because continuous functions are closed
under addition and scalar multiplication (the sum of two continuous functions
is continuous, etc.), $C[a, b]$ is a vector space. ◆


Example 9. The space $C^\infty(\mathbb{R})$ consists of all real-valued functions on $\mathbb{R}$ that have
derivatives of all orders. Vector addition and scalar multiplication are defined
as in example 7. ◆


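Theorem 3.1.1.
(a) The zero vector is unique.
(b) For $u \in U$ and $a \in \mathbb{K}$, $0.u = 0$ and $a.0 = 0$ (on the left, the first 0 is the scalar zero;
all other occurrences of 0 denote the zero vector).
(c) $(-a)u = a(-u) = -(au)$.
(d) $\big(\sum_{i=1}^{n} a_i\big)u = \sum_{i=1}^{n} a_i u$.
(e) $a\big(\sum_{i=1}^{n} u_i\big) = \sum_{i=1}^{n} a u_i$.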


Definition. A subset $V$ of a vector space $U$ is called a subspace of $U$ if $V$ is closed
under vector addition and scalar multiplication. Thus $V$ is a subspace if, for all
$v, w \in V$, and all $a \in \mathbb{K}$, $v + w \in V$, and $av \in V$. It is clear that $V$ is a vector space
in its own right.


Example 10. The set $V = \{(x_1, x_2, 0) : x_1, x_2 \in \mathbb{K}\}$ is a subspace of $\mathbb{K}^3$. ◆


Example 11. More generally, for $n < m$, $\mathbb{K}^n$ can be viewed as a subspace of $\mathbb{K}^m$ if
we identify an element $(x_1, \ldots, x_n) \in \mathbb{K}^n$ with the element $(x_1, \ldots, x_n, 0, \ldots, 0) \in
\mathbb{K}^m$. ◆


Example 12. For every $n \in \mathbb{N}$, $\mathbb{P}_n$ is a subspace of $\mathbb{P}$. ◆


Example 13. For an arbitrary nonempty set $I$, $\mathbb{K}(I)$ is a subspace of $\mathbb{K}^I$. In
particular, $\mathbb{K}(\mathbb{N})$ is a subspace of $\mathbb{K}^{\mathbb{N}}$. A particularly important subspace of $\mathbb{K}^{\mathbb{N}}$
is the space of bounded sequences,

$l^\infty = \{(x_1, x_2, \ldots) : \sup_n |x_n| < \infty\}$. ◆


Example 14. Two well-known subspaces of $l^\infty$ are the space $c$ of convergent
sequences, and the space $c_0$ of all sequences that converge to 0. We also call
$c_0$ the space of null sequences. ◆


Example 15. The space $C[a, b]$ is a subspace of $B[a, b]$ because a continuous
function on a closed bounded interval is bounded. ◆


Theorem 3.1.2. A subset $V$ of a vector space $U$ is a subspace of $U$ if and only if for
all $v, w \in V$, and all $a, b \in \mathbb{K}$, $av + bw \in V$.


Definitions. The Canonical Vectors:
(a) The canonical vectors in $\mathbb{K}^n$ are the $n$ vectors
$e_1 = (1, 0, \ldots, 0)$, $e_2 = (0, 1, 0, \ldots, 0)$, ..., and $e_n = (0, 0, \ldots, 1)$.
(b) The canonical vectors in $\mathbb{K}(\mathbb{N})$ are the sequences $e_n$ ($n \in \mathbb{N}$), where the $n$th
term of $e_n$ is one, and all the other terms are zero.
(c) The canonical vectors in $\mathbb{K}(I)$ are the functions $e_\alpha : I \to \mathbb{K}$, defined by
$e_\alpha(\beta) = \delta_{\alpha,\beta}$. Here $\delta_{\alpha,\beta}$ is the Kronecker delta:

$$\delta_{\alpha,\beta} = \begin{cases} 1 & \text{if } \alpha = \beta, \\ 0 & \text{if } \alpha \ne \beta. \end{cases}$$

Definition. Let $\{u_1, u_2, \ldots, u_n\}$ be a finite subset of a vector space $U$. A linear
combination of $u_1, u_2, \ldots, u_n$ is an element of $U$ of the form $\sum_{i=1}^{n} a_i u_i$ for some
scalars $a_1, \ldots, a_n$.


Example 16. Every vector in $\mathbb{K}^n$ is a linear combination of the canonical vectors
in $\mathbb{K}^n$. Indeed, if $x = (x_1, \ldots, x_n) \in \mathbb{K}^n$, then $x = \sum_{i=1}^{n} x_i e_i$. ◆


Example 17. In $\mathbb{K}(\mathbb{N})$, every vector is a linear combination of a finite number of
the canonical vectors, because if $f \in \mathbb{K}(\mathbb{N})$, and $a_1 = f(k_1), \ldots, a_n = f(k_n)$ are all
the nonzero terms of $f$, then $f = \sum_{i=1}^{n} a_i e_{k_i}$. ◆
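The decomposition in example 17 is easy to experiment with if an element of $\mathbb{K}(\mathbb{N})$ is stored by its nonzero terms only. The following is a minimal sketch under assumptions of our own: the rationals (Python's `Fraction`) stand in for the field $\mathbb{K}$, a sequence is a dictionary mapping an index to its nonzero term, and the helper names `unit`, `add`, `scale`, and `expand` are hypothetical.

```python
from fractions import Fraction

def unit(k):
    """The canonical vector e_k: 1 in position k, 0 elsewhere (stored sparsely)."""
    return {k: Fraction(1)}

def add(x, y):
    """Termwise sum; zero terms are dropped so only the support is stored."""
    out = dict(x)
    for k, v in y.items():
        s = out.get(k, Fraction(0)) + v
        if s == 0:
            out.pop(k, None)
        else:
            out[k] = s
    return out

def scale(a, x):
    """Scalar multiple a * x."""
    a = Fraction(a)
    return {} if a == 0 else {k: a * v for k, v in x.items()}

def expand(f):
    """Rebuild f as the finite sum of f(k) * e_k over its nonzero terms."""
    total = {}
    for k, a in f.items():
        total = add(total, scale(a, unit(k)))
    return total

f = {2: Fraction(3), 7: Fraction(-1, 2), 40: Fraction(5)}
print(expand(f) == f)
```

Storing only the support is what makes every element a finite object, which mirrors the requirement that all but finitely many terms vanish.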


Example 18. Every polynomial in $\mathbb{P}_n$ is a linear combination of the $n + 1$ vectors
$1, x, x^2, \ldots, x^n$. ◆


Definition. Let $S$ be a subset of a vector space $U$. The span of $S$, written Span(S),
is the collection of all linear combinations of finite subsets of $S$ (common
terminology: finite linear combinations of $S$). To reiterate, a finite linear combination
of $S$ is a vector in $U$ of the form $\sum_{i=1}^{n} a_i u_i$, where $u_i \in S$ and $a_i \in \mathbb{K}$.


Example 19. In $\mathbb{K}^n$, Span($\{e_1, e_2, \ldots, e_n\}$) $= \mathbb{K}^n$. In $\mathbb{K}(\mathbb{N})$, Span($\{e_1, e_2\}$) is the set
of all sequences where only the first two terms may be nonzero. ◆


It is easy to see that Span(S) is a subspace of $U$. If $V$ is a subspace of $U$ that contains
$S$, then $V$ contains all finite linear combinations of $S$. Thus Span(S) $\subseteq V$, hence the
following result.
Theorem 3.1.3. Span(S) is the smallest subspace of $U$ containing $S$.


Theorem 3.1.4. If $\{V_\alpha\}$ is a collection of subspaces of $U$, then $\bigcap_\alpha V_\alpha$ is a subspace
of $U$.


Theorem 3.1.5. Span(S) is the intersection of all the subspaces containing $S$.


Exercises 


1. Prove theorem 3.1.1.

2. Prove theorem 3.1.2.

3. Prove theorem 3.1.4.

4. Prove theorem 3.1.5.

5. Let $S_1$ and $S_2$ be subsets of a vector space $U$. Prove that Span($S_1 \cup S_2$) $=$
Span($S_1$) if and only if $S_2 \subseteq$ Span($S_1$).


3.2 Independent Sets and Bases 


This section is focused on the concepts of linear independence and bases. Our
approach to studying bases is unified in the sense that we do not treat
finite-dimensional and infinite-dimensional spaces separately. We use Zorn’s lemma to
prove the existence of a basis. A number of important equivalent characterizations
of a basis are also discussed, both in the body of the section and in the section
exercises.


Definition. A finite subset $\{u_1, u_2, \ldots, u_n\}$ of a vector space $U$ is dependent if there
exist scalars $a_1, a_2, \ldots, a_n$, not all zero, such that $\sum_{i=1}^{n} a_i u_i = 0$.


Terminology. A vector of the form $\sum_{i=1}^{n} a_i u_i$, where at least one $a_i \ne 0$, is called
a nontrivial linear combination of $u_1, u_2, \ldots, u_n$. The above definition can be
restated as follows: $\{u_1, u_2, \ldots, u_n\}$ is dependent if some nontrivial linear
combination of $u_1, u_2, \ldots, u_n$ is zero.


Theorem 3.2.1. A subset $S = \{u_1, u_2, \ldots, u_n\}$ of a vector space $U$ is dependent if and
only if one of the vectors in $S$ is a linear combination of the remaining vectors.


Proof. Suppose $\{u_1, u_2, \ldots, u_n\}$ is dependent. Then $\sum_{i=1}^{n} a_i u_i = 0$ for scalars
$a_1, a_2, \ldots, a_n$, not all zero. Say $a_j \ne 0$. Then $u_j = -\frac{1}{a_j} \sum_{i \ne j} a_i u_i$. Conversely, if
$u_j = \sum_{i \ne j} a_i u_i$, then $1.u_j - \sum_{i \ne j} a_i u_i = 0$, and $\{u_1, u_2, \ldots, u_n\}$ is dependent. ∎
Definition. A finite subset $\{u_1, u_2, \ldots, u_n\}$ of a vector space $U$ is independent
if it is not dependent. Equivalently, $\{u_1, u_2, \ldots, u_n\}$ is independent if a linear
combination $\sum_{i=1}^{n} a_i u_i$ is equal to zero if and only if each $a_i = 0$.

Example 1. The set $\{e_1, \ldots, e_n\}$ is independent in $\mathbb{K}^n$. ◆

Example 2. Any finite subset of $\{e_n : n \in \mathbb{N}\}$ is independent in $\mathbb{K}(\mathbb{N})$. ◆

Example 3. Any finite subset of the monomials $1, x, x^2, \ldots$ is independent in $\mathbb{P}$. ◆

Example 4. Any finite subset of the canonical vectors $e_\alpha$ in $\mathbb{K}(I)$ is independent. ◆

Definitions. An infinite subset $S$ of a vector space $U$ is independent if every finite
subset of $S$ is independent. An infinite subset $S$ of vectors is dependent if some
finite subset of $S$ is dependent.

The following examples follow immediately from the previous set of examples.

Example 5. The set of canonical vectors $\{e_n : n \in \mathbb{N}\}$ is independent in $\mathbb{K}(\mathbb{N})$. ◆

Example 6. The set of all monomials $\{1, x, x^2, \ldots\}$ is independent in $\mathbb{P}$. ◆

Example 7. The set of canonical vectors $\{e_\alpha : \alpha \in I\}$ is independent in $\mathbb{K}(I)$. ◆


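Example 8. The functions $f_\alpha(x) = e^{\alpha x}$, $\alpha \in \mathbb{R}$, are independent in $C^\infty(\mathbb{R})$.

We show that, for any finite set $\{\alpha_1, \ldots, \alpha_n\}$ of distinct real numbers, the
functions $e^{\alpha_1 x}, \ldots, e^{\alpha_n x}$ are independent. Suppose that, for constants $c_1, \ldots, c_n$,
$\sum_{i=1}^{n} c_i e^{\alpha_i x} = 0$. Repeated differentiation of the above identity ($n - 1$ times)
yields $\sum_{i=1}^{n} \alpha_i^j c_i e^{\alpha_i x} = 0$, $j = 0, \ldots, n - 1$. Evaluating each of the last identities
at $x = 0$ yields the system of linear equations

$$\sum_{i=1}^{n} \alpha_i^j c_i = 0, \quad j = 0, \ldots, n - 1.$$

The matrix of the system is the famous Vandermonde matrix,

$$V = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ \alpha_1 & \alpha_2 & \cdots & \alpha_n \\ \vdots & \vdots & & \vdots \\ \alpha_1^{n-1} & \alpha_2^{n-1} & \cdots & \alpha_n^{n-1} \end{pmatrix}.$$

Since $\det(V) = \prod_{1 \le i < j \le n} (\alpha_j - \alpha_i) \ne 0$, we must have $c_1 = \ldots = c_n = 0$,
establishing the independence of the set $\{f_\alpha(x) = e^{\alpha x} : \alpha \in \mathbb{R}\}$. ◆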
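The determinant formula used in example 8 can be checked numerically for particular exponents. The sketch below assumes rational values of $\alpha_i$ so that the arithmetic is exact; the function names `vandermonde`, `det`, and `product_formula` are ours, and the determinant is computed by ordinary Gaussian elimination rather than by any method from the text.

```python
from fractions import Fraction
from itertools import combinations

def vandermonde(alphas):
    """Matrix with entry alpha_i ** j in row j, column i (j = 0, ..., n - 1)."""
    n = len(alphas)
    return [[Fraction(a) ** j for a in alphas] for j in range(n)]

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)               # no pivot: the matrix is singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result                 # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def product_formula(alphas):
    """Product of (alpha_j - alpha_i) over all pairs i < j."""
    out = Fraction(1)
    for i, j in combinations(range(len(alphas)), 2):
        out *= Fraction(alphas[j]) - Fraction(alphas[i])
    return out

alphas = [0, 1, 2, 5]
print(det(vandermonde(alphas)), product_formula(alphas))
```

For distinct exponents both values agree and are nonzero, which is exactly what forces $c_1 = \ldots = c_n = 0$; repeating an exponent makes two columns equal and the determinant vanish.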


Definition. Let $U$ be a vector space. A basis for $U$ is a maximal independent subset
of $U$. To rephrase, $S$ is a basis if $S$ is independent and any subset of $U$ properly
containing $S$ is dependent. A basis for a vector space is sometimes called a linear
basis or a Hamel basis.


Theorem 3.2.2. Every vector space U has a basis. 


Proof. Let $\mathfrak{B}$ be the collection of all independent subsets of $U$. Order $\mathfrak{B}$ by set
inclusion, and let $\mathfrak{C}$ be a chain in $\mathfrak{B}$. We show that $\bigcup\{C : C \in \mathfrak{C}\}$ is independent.
Let $\{u_1, u_2, \ldots, u_n\}$ be a subset of $\bigcup\{C : C \in \mathfrak{C}\}$. Then, for each $1 \le i \le n$, there is
a member $C_i \in \mathfrak{C}$ such that $u_i \in C_i$. Because $\mathfrak{C}$ is a chain, one set $C_j$ contains all
the other sets $C_1, \ldots, C_n$. Therefore $\{u_1, u_2, \ldots, u_n\}$ is a subset of $C_j$ and hence
is independent. Thus $\mathfrak{C}$ has an upper bound, namely, $\bigcup\{C : C \in \mathfrak{C}\}$. By Zorn’s
lemma, $\mathfrak{B}$ has a maximal member, that is, a maximal independent subset of $U$,
that is, a basis for $U$. ∎


The corollary below says that an independent subset of a vector space can be 
augmented to a basis. 


Corollary 3.2.3. Let $S_1$ be an independent subset of a vector space $U$. Then there is
a basis $S$ for $U$ containing $S_1$.


Proof. The proof parallels that of theorem 3.2.2, except that $\mathfrak{B}$ is defined to be the
family of all independent subsets of $U$ that contain $S_1$; $\mathfrak{B} \ne \emptyset$ because $S_1 \in \mathfrak{B}$. ∎


Theorem 3.2.4. Let $S$ be a subset of a vector space $U$. The following are equivalent:

(a) $S$ is a basis.

(b) $S$ is independent and spans $U$ (meaning that Span(S) $= U$).

(c) Every nonzero element of $U$ can be written uniquely as a finite linear
combination of vectors in $S$. Specifically, if $u \ne 0$, then there exists a unique subset
$\{u_1, \ldots, u_n\}$ of $S$ and a unique set of nonzero scalars $\{a_1, \ldots, a_n\}$ such that
$u = \sum_{i=1}^{n} a_i u_i$.


Proof. (a) implies (b). Since a basis is independent, we only need to show that
Span(S) $= U$. Let $u \in U$ and, without loss of generality, assume that $u \notin S$. Then
$S_1 = S \cup \{u\}$ is dependent, so a finite subset $S_2$ of $S_1$ is dependent. $S_2$ must contain
$u$ because the other elements of $S_2$ are independent. Write $S_2 = \{u, u_1, \ldots, u_n\}$. Then
there are scalars $a, a_1, \ldots, a_n$, not all zero, such that $au + a_1 u_1 + \ldots + a_n u_n = 0$;
$a \ne 0$ because otherwise, $\{u_1, \ldots, u_n\}$ would be dependent. Hence $u = -\frac{1}{a}(a_1 u_1 +
\ldots + a_n u_n)$. Thus $S$ spans $U$.


(b) implies (c). We only need to show the uniqueness of the representation of a
nonzero element $u \in U$ as a finite linear combination of $S$. Suppose there are
finite subsets $E$ and $F$ of $S$ such that $u$ can be written as a linear combination
of the elements of both $E$ and $F$. We will show that $E \ne F$ leads to a contradiction.
We adopt the notation $E \cap F = \{u_1, \ldots, u_r\}$, $E - F = \{u_{r+1}, \ldots, u_s\}$ and $F - E =
\{u_{s+1}, \ldots, u_n\}$. The assumption is that there are nonzero scalars $a_1, \ldots, a_s$ and
$b_1, \ldots, b_r, b_{s+1}, \ldots, b_n$ such that

$$u = \sum_{i=1}^{r} a_i u_i + \sum_{i=r+1}^{s} a_i u_i = \sum_{i=1}^{r} b_i u_i + \sum_{i=s+1}^{n} b_i u_i.$$

Rearranging the above equation, we have $\sum_{i=1}^{r} (a_i - b_i) u_i + \sum_{i=r+1}^{s} a_i u_i -
\sum_{i=s+1}^{n} b_i u_i = 0$. This would contradict the independence of $E \cup F$ unless $E - F =
\emptyset = F - E$. Now $\sum_{i=1}^{r} (a_i - b_i) u_i = 0$, and the independence of $E$ forces $a_i = b_i$
for all $1 \le i \le r$.


(c) implies (a). First observe that the zero vector is not in $S$ because otherwise
the uniqueness of representation of any finite linear combination of $S$ would be
violated by adding $1.0$ to it. To show the independence of $S$, suppose a nontrivial linear
combination of some finite subset of $S$ is equal to zero, say, $\sum_{i=1}^{n} a_i u_i = 0$.
By the previous observation, at least two of the coefficients are nonzero, say,
$a_1 \ne 0 \ne a_2$. In this case, $a_1 u_1 = -\sum_{i=2}^{n} a_i u_i$. This contradicts the uniqueness
of the representation of $a_1 u_1$ and proves the independence of $S$. To show that
$S$ is maximal, let $u \in U - S$. Then $u$ is a finite linear combination of elements
$u_1, \ldots, u_n$ of $S$. This implies that $\{u, u_1, \ldots, u_n\}$ is dependent, and hence $\{u\} \cup S$ is
dependent. This establishes the maximality of $S$. ∎


Example 9. The canonical vectors $e_1, \ldots, e_n$ are independent and span $\mathbb{K}^n$ and
therefore form a basis for $\mathbb{K}^n$. ◆


Example 10. The set $S = \{1, x, x^2, \ldots\}$ is independent and spans $\mathbb{P}$ and is therefore
a basis for $\mathbb{P}$. Naturally, we call $S$ the canonical basis for $\mathbb{P}$. ◆


Example 11. For the same reason, $\{e_n : n \in \mathbb{N}\}$ is a basis for $\mathbb{K}(\mathbb{N})$. ◆


Theorem 3.2.5. If $S$ is a basis for a vector space $U$, then $S$ is a minimal spanning
set for $U$ in the sense that Span(S) $= U$ and no proper subset of $S$ spans $U$.

Proof. If $S$ is a basis for $U$, then, by the previous theorem, $S$ is independent and
Span(S) $= U$. If $S_1$ is a proper subset of $S$ that also spans $U$, then, again by the
previous theorem, $S_1$ would also be a basis for $U$. This contradicts the maximality
of $S_1$, and hence the very definition of a basis, because $S_1$ is a proper subset
of the independent set $S$. ∎


Exercises 


1. Prove that a subset of an independent set is independent. 

2. Prove that a set containing a dependent set is dependent. 

3. Prove that a minimal spanning subset of a vector space U is a basis for U. 
This is the converse of theorem 3.2.5. 

4. Prove that every spanning subset of a vector space U contains a basis for U. 


5. Find a basis for $\mathbb{K}_{m \times n}$.


3.3 The Dimension of a Vector Space


In this section, we discuss the definition of dimension and prove the invariance
of the cardinality of the basis. Some results on cardinal arithmetic are needed in
the infinite-dimensional case. We also prove the existence of a vector space of any
given dimension.


Definition. A vector space U is said to be finite dimensional if it contains a finite 
basis. 


Example 1. K" and P,, are finite dimensional. @ 


Lemma 3.3.1. Consider the following system of linear equations with coefficients
in $\mathbb{K}$:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m &= 0,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m &= 0,\\
&\;\;\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nm}x_m &= 0.
\end{aligned}
$$

If $m > n$, then the system has a nontrivial (i.e., nonzero) solution
$(x_1, \dots, x_m) \in \mathbb{K}^m$.

Proof. Without loss of generality, assume that $m = n + 1$, because we can augment
the system by adding $m - n - 1$ equations with zero coefficients to the system.
If every coefficient is zero, then every vector is a solution. Otherwise, we may
assume, by reordering the equations and renumbering the variables, that
$a_{11} \neq 0$. We prove the lemma by induction on $n$. Subtracting $a_{i1}/a_{11}$ times
the top equation from equation $i$, $2 \le i \le n$, yields the equivalent system

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1,n+1}x_{n+1} &= 0,\\
b_{22}x_2 + \cdots + b_{2,n+1}x_{n+1} &= 0,\\
&\;\;\vdots\\
b_{n2}x_2 + \cdots + b_{n,n+1}x_{n+1} &= 0,
\end{aligned}
$$

where $b_{ij} = a_{ij} - a_{i1}a_{1j}/a_{11}$, $2 \le i \le n$, $2 \le j \le n + 1$. The bottom $n - 1$
equations of the above system have a nontrivial solution $(x_2, \dots, x_{n+1})$, by the
inductive hypothesis. (When $n = 1$ there are no such equations, and any nonzero
$x_2$ will do.) Defining

$$x_1 = -\frac{1}{a_{11}} \sum_{j=2}^{n+1} a_{1j}x_j$$

yields a nontrivial solution $(x_1, \dots, x_{n+1})$ of the original system. ∎
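
As a computational companion to lemma 3.3.1, the following Python sketch produces a nontrivial solution of a homogeneous system with more unknowns than equations. It is only an illustration: it uses full row reduction over the rationals rather than the inductive elimination in the proof, and the name `nontrivial_solution` and the sample system are assumptions made here, not part of the text.

```python
from fractions import Fraction

def nontrivial_solution(rows):
    """Return a nonzero x with sum_j rows[i][j] * x[j] == 0 for every i.

    Assumes at least one row, all rows of equal length m, and m > len(rows).
    """
    a = [[Fraction(v) for v in row] for row in rows]
    n, m = len(a), len(a[0])
    pivots = []   # pivot column of each reduced row
    r = 0         # index of the next pivot row
    for c in range(m):
        # look for a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, n) if a[i][c] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        a[r] = [v / a[r][c] for v in a[r]]          # scale the pivot to 1
        for i in range(n):
            if i != r and a[i][c] != 0:
                f = a[i][c]
                a[i] = [vi - f * vr for vi, vr in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
        if r == n:
            break
    # m > n guarantees at least one column without a pivot (a free variable)
    free = next(c for c in range(m) if c not in pivots)
    x = [Fraction(0)] * m
    x[free] = Fraction(1)
    for i, c in enumerate(pivots):
        x[c] = -a[i][free]
    return x

rows = [[1, 2, 3], [4, 5, 6]]   # 2 equations, 3 unknowns
x = nontrivial_solution(rows)
print(x)
```

Exact rational arithmetic is used so that the check "every equation evaluates to zero" is not blurred by rounding.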


Lemma 3.3.2. If a finite-dimensional space $U$ has a basis $S = \{u_1, \dots, u_n\}$ of $n$
vectors, then any subset of $U$ containing more than $n$ elements is dependent.


Proof. Let $\{v_1, \dots, v_m\}$ be a subset of $U$ with $m > n$. Each $v_j$ is a linear combination
of $S$, say, $v_j = \sum_{i=1}^{n} a_{ij}u_i$ $(1 \le j \le m)$. By lemma 3.3.1, there exists a
nontrivial solution $(x_1, \dots, x_m)$ of the system $\sum_{j=1}^{m} a_{ij}x_j = 0$, $i = 1, 2, \dots, n$. Now

$$
\sum_{j=1}^{m} x_j v_j = \sum_{j=1}^{m} x_j \sum_{i=1}^{n} a_{ij}u_i
= \sum_{i=1}^{n} \Big( \sum_{j=1}^{m} a_{ij}x_j \Big) u_i = \sum_{i=1}^{n} 0 \cdot u_i = 0.
$$

Thus $\{v_1, \dots, v_m\}$ is dependent. ∎


We now prove the invariance of the number of vectors in a basis for a finite- 
dimensional space. 


Theorem 3.3.3. If $S = \{u_1, \dots, u_n\}$ and $T = \{v_1, \dots, v_m\}$ are bases for a finite-
dimensional vector space $U$, then $n = m$.


Proof. Since $S$ is independent and $T$ is a basis, $n \le m$, by the previous lemma. For
the same reason, $m \le n$. ∎


Definition. The dimension of a finite-dimensional vector space $U$ is the number
of elements in a basis for $U$. This number is independent of the basis by the
previous theorem.

Example 2. The dimension of $\mathbb{K}^n$ is $n$. The dimension of $P_n$ is $n + 1$. The dimension
of $\mathbb{K}_{m \times n}$ is $mn$. ∎


Definition. A vector space U is said to be infinite dimensional if it is not finite 
dimensional. Thus U is infinite dimensional if every basis for U is infinite. 


As in the finite-dimensional case, the cardinality of a basis for an infinite- 
dimensional space is an invariant of the space, as the following theorem shows. 


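Theorem 3.3.4. Let $\{u_\alpha\}_{\alpha \in I}$ and $\{v_\beta\}_{\beta \in J}$ be bases for an infinite-dimensional space
$U$. Then $\mathrm{Card}(I) = \mathrm{Card}(J)$.


Proof. For each $\beta \in J$, there is a finite subset $I_\beta \subseteq I$ such that $v_\beta$ is a linear combina-
tion of the finite set $\{u_\alpha : \alpha \in I_\beta\}$. Therefore

$$
U = \mathrm{Span}(\{v_\beta : \beta \in J\}) \subseteq \mathrm{Span}\big(\{u_\alpha : \alpha \in \textstyle\bigcup_{\beta \in J} I_\beta\}\big) \subseteq U.
$$

Since no proper subset of $\{u_\alpha\}_{\alpha \in I}$ spans $U$ (theorem 3.2.5), $I = \bigcup_{\beta \in J} I_\beta$. Using
theorems 2.3.11 and 2.3.10 (also see problem 8 in section 2.3),

$$
\mathrm{Card}(I) = \mathrm{Card}\Big(\bigcup_{\beta \in J} I_\beta\Big) \le \sum_{\beta \in J} \mathrm{Card}(I_\beta)
\le \sum_{\beta \in J} \aleph_0 = \aleph_0\,\mathrm{Card}(J) = \mathrm{Card}(J).
$$

Likewise, $\mathrm{Card}(J) \le \mathrm{Card}(I)$, and the proof is complete. ∎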


Now that we have proved the invariance of the cardinality of a basis in an infinite-
dimensional space, we define the dimension of such a space to be the cardinality
of any basis for the space.


Notation. We use the notation $\dim_{\mathbb{K}}(U)$ to denote the dimension of a vector space
$U$ over the field $\mathbb{K}$. If the base field is understood, we simply write $\dim(U)$.


Example 3. $\dim(\mathbb{K}(\mathbb{N})) = \aleph_0 = \dim(P)$. ∎

Example 4. In example 8 in section 3.2, we proved that the set of functions
$\{f_a(x) = e^{ax} : a \in \mathbb{R}\}$ is independent in $C^\infty(\mathbb{R})$. This shows that
$\dim(C^\infty(\mathbb{R})) \ge \mathfrak{c}$. ∎


We now show the existence of a vector space of any given dimension. The essential
uniqueness of such a space will be discussed in section 3.4.

Theorem 3.3.5. Let $\aleph$ be a cardinal number. Then there is a vector space of
dimension $\aleph$.


Proof. If $\aleph$ is a finite cardinal, $n$, then $\mathbb{K}^n$ has dimension $n$. So assume that $\aleph$ is
infinite, and let $I$ be a set such that $\mathrm{Card}(I) = \aleph$. We show that the space $U = \mathbb{K}(I)$
discussed in section 3.1 has dimension $\aleph$ by finding a basis for $U$ which is in one-to-
one correspondence with $I$. Let $S = \{e_\alpha\}_{\alpha \in I}$ be the set of canonical vectors in $\mathbb{K}(I)$;
$S$ is clearly in one-to-one correspondence with $I$. We show that $S$ is a basis for $U$. Let
$\{e_{\alpha_1}, \dots, e_{\alpha_n}\}$ be a finite subset of $S$, and suppose that, for some scalars $a_1, \dots, a_n$,
$f = \sum_{i=1}^{n} a_i e_{\alpha_i} = 0$ (the zero function from $I$ to $\mathbb{K}$). For a fixed $1 \le j \le n$,
$0 = f(\alpha_j) = \sum_{i=1}^{n} a_i e_{\alpha_i}(\alpha_j) = a_j$. This shows the independence of $S$. Next we show that
$S$ spans $U$. Let $f \in U$ and let $a_1 = f(\alpha_1), \dots, a_n = f(\alpha_n)$ be all the nonzero values
of $f$. Clearly, $f = \sum_{i=1}^{n} a_i e_{\alpha_i}$. ∎


Exercises 


1. In this problem, the base field is $\mathbb{R}$. Let $V_1$ be the set of real symmetric $n \times n$
matrices, and let $V_2$ be the set of skew-symmetric matrices. Show that $V_1$
and $V_2$ are subspaces of $\mathbb{R}_{n \times n}$, and find their dimensions. An $n \times n$ matrix
$A$ is skew-symmetric if, for all $1 \le i, j \le n$, $a_{ij} = -a_{ji}$.

2. Let $V$ be a subspace of $U$. Show that $\dim(V) \le \dim(U)$.

3. Let $U$ be an $n$-dimensional vector space, and let $S$ be a subset of $U$ of exactly
$n$ elements. Prove that the following are equivalent:

(a) $S$ is a basis for $U$.
(b) $S$ is independent.
(c) $S$ spans $U$.

4. Let $V$ be a subspace of $U$. Show that if $V$ contains a basis for $U$, then $V = U$.

5. Show that a vector space $U$ is infinite dimensional if and only if it contains
an infinite independent subset.

6. Let $U$ be an infinite-dimensional vector space. Show that there is a sequence
$V_1 \supset V_2 \supset \dots$ (proper containments) of subspaces of $U$ such that $\dim(V_n) =
\dim(U)$ for all $n$.

7. Let $\{x_0, \dots, x_n\}$ be a set of distinct real numbers. For $0 \le i \le n$, define the
following set of polynomials in $P_n$:

$$
L_0(x) = \frac{(x - x_1)(x - x_2) \cdots (x - x_n)}{(x_0 - x_1)(x_0 - x_2) \cdots (x_0 - x_n)}, \qquad
L_1(x) = \frac{(x - x_0)(x - x_2) \cdots (x - x_n)}{(x_1 - x_0)(x_1 - x_2) \cdots (x_1 - x_n)},
$$

..., and

$$
L_n(x) = \frac{(x - x_0)(x - x_1) \cdots (x - x_{n-1})}{(x_n - x_0)(x_n - x_1) \cdots (x_n - x_{n-1})}.
$$

Show that the set $\{L_i\}_{i=0}^{n}$ is a basis for $P_n$. Hint: For a polynomial $f \in P_n$,
show that $f = \sum_{i=0}^{n} f(x_i)L_i(x)$. Observe that $L_i(x_j) = \delta_{ij}$.

8. Let $I = [a, b]$ be a closed, bounded interval, and suppose $a = t_1 < t_2 < \dots <
t_n = b$ is a fixed set of points (also called nodes) in $I$. Define $V$ to be the
set of continuous functions on $[a, b]$ whose restrictions to the subintervals
$[t_i, t_{i+1}]$ are linear. Prove that $V$ is a vector space, and find a basis for it.
A function in the space $V$ is known as a continuous, piecewise linear
function with nodes $\{t_1, \dots, t_n\}$.

9. The space of continuous, piecewise linear functions. Let $U$ be the collec-
tion of all continuous, piecewise linear functions on $[a, b]$. Prove that $U$ is
an infinite-dimensional vector space.

10. Show that $\mathbb{R}$ is an infinite-dimensional vector space over $\mathbb{Q}$.

11. Let $M$ be a field, let $L$ be a subfield of $M$, and let $K$ be a subfield of $L$.
We can consider $L$ as a vector space over $K$, and $M$ as a vector space
over either $L$ or $K$. Prove that if $\dim_L(M)$ and $\dim_K(L)$ are finite, then
$\dim_K(M) = \dim_L(M) \cdot \dim_K(L)$.
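
The observation $L_i(x_j) = \delta_{ij}$ in the hint to exercise 7 can be checked numerically. The sketch below is an illustration only: the function name `lagrange_basis`, the sample nodes, and the test polynomial are assumptions chosen here, and a numerical check on one example is of course not a proof.

```python
from fractions import Fraction

def lagrange_basis(nodes, i, x):
    """Evaluate L_i at x for the given distinct (integer or rational) nodes."""
    value = Fraction(1)
    for k, xk in enumerate(nodes):
        if k != i:
            value *= Fraction(x - xk, nodes[i] - xk)
    return value

nodes = [0, 1, 3]

# L_i(x_j) should be 1 when i == j and 0 otherwise.
table = [[lagrange_basis(nodes, i, xj) for xj in nodes] for i in range(len(nodes))]

# Reconstruct f(x) = x^2 + 1 at x = 2 from its values at the nodes.
f = lambda t: t * t + 1
value_at_2 = sum(f(xi) * lagrange_basis(nodes, i, 2) for i, xi in enumerate(nodes))
print(table, value_at_2)
```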


3.4 Linear Mappings, Quotient Spaces, and Direct Sums 


A proper understanding of this section is essential for a smooth transition to the
rest of the book. While the early results in the section are elementary, a number
of important concepts make their debut later in the section. Specifically, these
include quotient spaces and quotient maps, direct sums, projections and algebraic
complements, linear functionals and linear operators, maximal subspaces and the
co-dimension of a subspace and, finally, the definition of an algebra over a field.


Definition. Let $U$ and $V$ be vector spaces over $\mathbb{K}$. A mapping $T : U \to V$ is said to
be linear if, for all $u, v \in U$, and all $a \in \mathbb{K}$,

$$T(u + v) = T(u) + T(v), \quad \text{and} \quad T(au) = aT(u).$$

The following are examples of linear mappings.

Example 1. Define $T : P \to P$ by $T(f) = f'$ (the derivative of $f$). ∎


Example 2. Let $A$ be an $m \times n$ matrix with entries in $\mathbb{K}$. The linear mapping
$T : \mathbb{K}^n \to \mathbb{K}^m$, defined by $T(x) = Ax$, is known as the mapping induced by the
matrix $A$. It is easy to check that every linear transformation $T : \mathbb{K}^n \to \mathbb{K}^m$
is induced by the $m \times n$ matrix $A$ whose columns are $T(e_1), \dots, T(e_n)$. Here
$\{e_1, \dots, e_n\}$ is the canonical basis for $\mathbb{K}^n$. The matrix $A$ is called the standard
matrix of $T$. ∎


Theorem 3.4.1. If $T : U \to V$ is linear, then

(a) $T(0) = 0$;

(b) $T(-u) = -T(u)$;

(c) if $a_1, \dots, a_n \in \mathbb{K}$ and $u_1, \dots, u_n \in U$, then $T\big(\sum_{i=1}^{n} a_i u_i\big) = \sum_{i=1}^{n} a_i T(u_i)$;

(d) the image under $T$ of a subspace of $U$ is a subspace of $V$; and

(e) the inverse image under $T$ of a subspace of $V$ is a subspace of $U$.


Definition. Let $T : U \to V$ be linear. The kernel (or null-space) of $T$, written
$\mathrm{Ker}(T)$ or $N(T)$, is $T^{-1}(0)$. The range of $T$ is defined by $R(T) = \{T(u) : u \in U\}$.


Theorem 3.4.2. Let $T : U \to V$ be linear. Then

(a) $N(T)$ is a subspace of $U$,
(b) $R(T)$ is a subspace of $V$, and
(c) $T$ is one-to-one if and only if $N(T) = \{0\}$.


Example 3. Let $T : P \to P$ be defined by $(Tf)(x) = \int_0^x f(t)\,dt$. It is easy to verify
directly that $N(T) = \{0\}$ and that $T$ is one-to-one. ∎


Definition. Let $T : U \to V$ be linear. The rank of $T$ is the dimension of $R(T)$
and the nullity of $T$ is the dimension of $N(T)$. The rank and nullity of a linear
mapping are particularly useful when they are finite.


Theorem 3.4.3. Let $U$ be a vector space of dimension $n < \infty$, and let $T$ be a linear
transformation from $U$ to a vector space $V$.
Then $\dim(\mathrm{Ker}(T)) + \dim(R(T)) = n$. In other words,

$$\mathrm{rank}(T) + \mathrm{nullity}(T) = n.$$

Proof. Let $S_1 = \{u_1, \dots, u_r\}$ be a basis for $\mathrm{Ker}(T)$. Augment $S_1$ to a basis $\{u_1, \dots, u_n\}$
for $U$. We show that $\mathrm{rank}(T) = n - r$ by showing that $\{T(u_{r+1}), \dots, T(u_n)\}$ is
a basis for $R(T)$. Every element $y$ in $R(T)$ has the form $T(x)$, where $x \in U$.
Write $x = \sum_{i=1}^{n} a_i u_i$; then $y = T\big(\sum_{i=1}^{n} a_i u_i\big) = \sum_{i=r+1}^{n} a_i T(u_i)$. This shows that
$T(u_{r+1}), \dots, T(u_n)$ span $R(T)$. Suppose, for some scalars $b_{r+1}, \dots, b_n$,
$\sum_{i=r+1}^{n} b_i T(u_i) = 0$. Then $T\big(\sum_{i=r+1}^{n} b_i u_i\big) = 0$, and $\sum_{i=r+1}^{n} b_i u_i \in \mathrm{Ker}(T)$, so
$\sum_{i=r+1}^{n} b_i u_i = \sum_{i=1}^{r} a_i u_i$ for some scalars $a_1, \dots, a_r$. This would contradict the independence
of $\{u_1, \dots, u_n\}$, unless $a_1, \dots, a_r$ and $b_{r+1}, \dots, b_n$ are all zero. This shows the
independence of $T(u_{r+1}), \dots, T(u_n)$ and concludes the proof. ∎


Example 4. Let $T : P_n \to P_n$ be defined by $T(f) = f'$. Clearly, $N(T)$ consists
of all constant functions. Now $R(T) = \mathrm{Span}(\{1, x, \dots, x^{n-1}\})$ because if
$g = \sum_{j=0}^{n-1} a_j x^j$, then $T(f) = g$, where $f = \sum_{j=0}^{n-1} \frac{a_j x^{j+1}}{j+1}$. Observe that $\mathrm{rank}(T) = n$,
and $\mathrm{nullity}(T) = 1$, consistent with theorem 3.4.3. ∎
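
The count in example 4 can be reproduced numerically. The following sketch, under the assumption that polynomials in $P_n$ are stored as coefficient lists relative to $1, x, \dots, x^n$, builds the matrix of differentiation and computes its rank by Gaussian elimination over the rationals. The helper names `rank` and `derivative_matrix` and the choice $n = 4$ are illustrative, not from the text.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a rational matrix, computed by Gaussian elimination."""
    a = [[Fraction(v) for v in row] for row in matrix]
    r = 0  # number of pivots found so far
    for c in range(len(a[0])):
        p = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        for i in range(r + 1, len(a)):
            f = a[i][c] / a[r][c]
            a[i] = [vi - f * vr for vi, vr in zip(a[i], a[r])]
        r += 1
        if r == len(a):
            break
    return r

def derivative_matrix(n):
    """Matrix of f -> f' on P_n; column j holds the coefficients of (x^j)'."""
    size = n + 1
    d = [[0] * size for _ in range(size)]
    for j in range(1, size):
        d[j - 1][j] = j
    return d

n = 4
D = derivative_matrix(n)
constant = [1] + [0] * n   # coordinates of the constant polynomial 1
image = [sum(D[i][j] * constant[j] for j in range(n + 1)) for i in range(n + 1)]
print(rank(D), image)
```

The rank comes out as $n$, and the constant polynomial is sent to zero, so the kernel is at least one-dimensional; theorem 3.4.3 then forces the nullity to be exactly $(n + 1) - n = 1$.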


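Theorem 3.4.4. Let $S = \{u_\alpha\}_{\alpha \in I}$ be a basis for a vector space $U$, and let $\{v_\alpha\}_{\alpha \in I}$ be
an arbitrary subset of a vector space $V$. Then there exists a unique linear mapping
$T : U \to V$ such that, for every $\alpha \in I$, $T(u_\alpha) = v_\alpha$.


Proof. Every vector $x \in U$ has a unique representation as $x = \sum_{\alpha \in F} a_\alpha u_\alpha$ for some
finite subset $F \subseteq I$. Define $T(x) = \sum_{\alpha \in F} a_\alpha v_\alpha$; $T$ is clearly linear (the interested
reader is encouraged to formulate the notation needed to write out the details).
To show that $T$ is unique, suppose $S : U \to V$ is another linear mapping such that
$S(u_\alpha) = v_\alpha$. Let $x = \sum_{\alpha \in F} a_\alpha u_\alpha \in U$. Then

$$
S(x) = S\Big(\sum_{\alpha \in F} a_\alpha u_\alpha\Big) = \sum_{\alpha \in F} a_\alpha S(u_\alpha)
= \sum_{\alpha \in F} a_\alpha v_\alpha = \sum_{\alpha \in F} a_\alpha T(u_\alpha) = T\Big(\sum_{\alpha \in F} a_\alpha u_\alpha\Big) = T(x). \;∎
$$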


The above theorem says that a linear mapping is completely (and uniquely) 
determined by its values on a basis. Stated differently, an arbitrary function on 
a basis for U can be uniquely extended to a linear function on U. 


Example 5. Let $S = \{1, x, x^2, \dots\}$ be the canonical basis for $P$, and define $T : S \to P$
by $T(1) = 0$, $T(x) = 0$, and, for $n \ge 2$, $T(x^n) = n(n - 1)x^{n-2}$. It is clear that the
unique linear mapping on $P$ that extends $T$ is $T(f) = f''$ (the second derivative
of $f$). ∎


Definition. A linear mapping T : U—> V is an isomorphism if it is a bijection. 
In this case, we say that U and V are isomorphic. Isomorphic spaces may have 
different underlying sets and different operations, but, from the algebraic point 
of view, they are essentially identical. 


Example 6. $P_n$ is isomorphic to $\mathbb{K}^{n+1}$ because the linear mapping
$T\big(\sum_{i=0}^{n} a_i x^i\big) = (a_0, a_1, \dots, a_n)$ is an isomorphism. ∎


Example 7. The space $P$ of all polynomials is isomorphic to $\mathbb{K}(\mathbb{N})$.
Let $f = \sum_{i=0}^{n} a_i x^i \in P$. The following linear mapping is an isomorphism: $T(f) =
y$, where $y$ is the sequence $(y_0, y_1, y_2, \dots)$ such that

$$
y_i = \begin{cases} a_i & \text{if } 0 \le i \le n, \\ 0 & \text{if } i > n. \end{cases} \;∎
$$

In theorem 3.3.5, we established the existence of a vector space of any given
dimension. We now show that such a space is unique up to an isomorphism.


Theorem 3.4.5. An $n$-dimensional vector space $U$ is isomorphic to $\mathbb{K}^n$.


Proof. Let $\{u_1, \dots, u_n\}$ be a basis for $U$, and define $T : U \to \mathbb{K}^n$ to be the unique
linear mapping that extends $T(u_i) = e_i$, $1 \le i \le n$; $T$ is clearly one-to-one, and it
is onto because its range contains the canonical basis for $\mathbb{K}^n$. ∎


Theorem 3.4.6. Let $U$ be a vector space of infinite dimension $\aleph$, and let $I$ be a set
such that $\mathrm{Card}(I) = \aleph$. Then $U$ is isomorphic to $\mathbb{K}(I)$.


Proof. Let $\{u_\alpha\}_{\alpha \in I}$ be a basis for $U$, and let $\{e_\alpha\}$ be the canonical basis for $\mathbb{K}(I)$. If
$x = \sum_{\alpha \in F} a_\alpha u_\alpha$ is the unique representation of an element $x \in U$ as a finite linear
combination of the basis elements, define $T : U \to \mathbb{K}(I)$ by $T(x) = \sum_{\alpha \in F} a_\alpha e_\alpha$.
The proof that $T$ is an isomorphism is much like the proof of the previous
theorem. ∎


Quotient Spaces 


Let $V$ be a subspace of a vector space $U$. Define a relation $R$ on $U$ by $xRy$ if
$x - y \in V$. It is easy to verify that $R$ is an equivalence relation. The equivalence
classes of $R$ are subsets of $U$ of the form $x + V = \{x + v : v \in V\}$. Such a set is
called a coset of $V$.


For example, let $U = \mathbb{R}^2$ and let $V$ be a one-dimensional subspace of $U$. Then $V$ is
a straight line containing the origin, and the cosets of $V$ are lines parallel to $V$. ∎


Definition. The quotient space $U/V$ (read $U$ modulo $V$) consists of the cosets of
$V$, endowed with a vector space structure by the operations

$$(x + V) + (y + V) = (x + y) + V$$

and

$$a(x + V) = (ax) + V.$$

The above operations are well defined in the sense that they do not depend on the
particular element $x$ chosen to represent the coset $x + V$. For example, if $x' + V =
x + V$ and $y' + V = y + V$, then $x' - x \in V$, $y' - y \in V$, and $(x' + y') - (x + y) \in V$;
hence $(x' + y') + V = (x + y) + V$. For brevity of notation, the coset $x + V$ will be
denoted by $\tilde{x}$.


Definition. Let $V$ be a subspace of a vector space $U$. The function $\pi : U \to U/V$,
defined by $\pi(x) = \tilde{x}$, is called the quotient map. It is easy to verify that $\pi$ is
linear.

Theorem 3.4.7. Let $T : U \to W$ be linear, and let $V = \mathrm{Ker}(T)$. Then $U/V$ is isomor-
phic to $R(T)$ via the isomorphism $\tilde{T}(\tilde{x}) = T(x)$.


Proof. We leave it to the reader to verify that $\tilde{T}$ is well defined. Clearly, $\tilde{T}$ is onto. We
verify the linearity of $\tilde{T}$:

$$
\tilde{T}(a\tilde{x} + b\tilde{y}) = \tilde{T}(\widetilde{ax + by}) = T(ax + by) = aT(x) + bT(y) = a\tilde{T}(\tilde{x}) + b\tilde{T}(\tilde{y}).
$$

To show that $\tilde{T}$ is one-to-one, suppose $\tilde{T}(\tilde{x}) = 0$. Therefore $T(x) = 0$, and
$x \in \mathrm{Ker}(T) = V$; hence $\tilde{x} = 0$. ∎


Direct Sums 


Definition. Let $U_1$ and $U_2$ be subspaces of a vector space $U$. The sum of $U_1$ and
$U_2$ is the set $U_1 + U_2 = \{x + y : x \in U_1, y \in U_2\}$. It is clear that $U_1 + U_2$ is a
subspace of $U$.


Example 8. Let $U = \mathbb{R}^3$, and let $U_1$ and $U_2$ be distinct lines containing the origin.
Then the subspace $U_1 + U_2$ is the plane that contains $U_1$ and $U_2$. ∎


Theorem 3.4.8. $U_1 + U_2 = \mathrm{Span}(U_1 \cup U_2)$.


Definition. A vector space $U$ is the direct sum of two subspaces $U_1$ and $U_2$ if
$U_1 + U_2 = U$, and $U_1 \cap U_2 = \{0\}$. In this case, we write $U = U_1 \oplus U_2$ and say
that $U_2$ is an algebraic complement of $U_1$ in $U$.


Example 9. $\mathbb{R}^2 = \{(x_1, 0) : x_1 \in \mathbb{R}\} \oplus \{(0, x_2) : x_2 \in \mathbb{R}\}$. ∎


Theorem 3.4.9. Let $U_1$ and $U_2$ be subspaces of a vector space $U$. Then $U = U_1 \oplus U_2$
if and only if every vector $u \in U$ can be written uniquely as $u = u_1 + u_2$, where
$u_1 \in U_1$, $u_2 \in U_2$.


Proof. Such a representation of $u$ is guaranteed by the definition of a direct sum. To
prove the uniqueness, suppose $u_1 + u_2 = v_1 + v_2$, where $u_1, v_1 \in U_1$ and $u_2, v_2 \in
U_2$. Then $u_1 - v_1 = v_2 - u_2 \in U_1 \cap U_2 = \{0\}$, so $u_1 - v_1 = v_2 - u_2 = 0$; that is,
$u_1 = v_1$ and $u_2 = v_2$. The converse is straightforward. ∎


Example 10. Let $c$ be the space of all convergent sequences, and let $c_0$ be the
space of all sequences that converge to 0. We show that $c = c_0 \oplus \mathrm{Span}(\{e\})$,
where $e = (1, 1, 1, \dots)$. Let $x = (x_1, x_2, \dots)$ be a convergent sequence, and let
$\xi = \lim_n x_n$. Define $y = (x_1 - \xi, x_2 - \xi, x_3 - \xi, \dots)$, and let $z = \xi e = (\xi, \xi, \xi, \dots)$.
Clearly, $y$ converges to 0 (i.e., $y \in c_0$), and $x = y + z$. The representation
$x = y + z$ is unique because the only constant sequence that converges to 0 is
the zero sequence. ∎


Theorem 3.4.10. Every subspace $U_1$ of a vector space $U$ has a complement in $U$.


Proof. We need to show that there is a subspace $U_2$ of $U$ such that $U = U_1 \oplus U_2$. Let
$S_1$ be a basis for $U_1$. Augment $S_1$ to a basis $S$ of $U$, and let $S_2 = S - S_1$. If $U_2 =
\mathrm{Span}(S_2)$, then $U = U_1 \oplus U_2$. We leave it to the reader to write out the details. ∎


Definition. Let $U = U_1 \oplus U_2$. The projection $\pi_1 : U \to U_1$ is the linear mapping
$\pi_1(u) = u_1$, where $u = u_1 + u_2$ is the unique representation of $u$ provided
by theorem 3.4.9. The projection $\pi_2$ onto $U_2$ is defined similarly. Some of the
properties of projections are explored in the section exercises.


Example 11. $\mathbb{R}^2 = \{(x_1, 0) : x_1 \in \mathbb{R}\} \oplus \{(0, x_2) : x_2 \in \mathbb{R}\}$. The projection $\pi_1$
projects $\mathbb{R}^2$ onto the $x_1$-axis in the sense of elementary geometry. ∎


Theorem 3.4.11. Let $U = U_1 \oplus U_2$. Then $\mathrm{Ker}(\pi_1) = U_2$, and $U/U_2$ is isomorphic
to $U_1$.


Proof. To verify that $\mathrm{Ker}(\pi_1) = U_2$, let $x = u_1 + u_2$, where $u_1 \in U_1$, $u_2 \in U_2$. Then
$\pi_1(x) = 0$ if and only if $u_1 = 0$, if and only if $x = u_2 \in U_2$. The fact that $U/U_2$ is
isomorphic to $U_1$ follows from theorem 3.4.7. ∎


Linear Functionals and Operators 


A particularly important set of linear transformations is that from a vector space 
U to the base field K. 


Definition. A linear mapping from a vector space $U$ to the base field $\mathbb{K}$ is called a
linear functional on $U$.


The following are examples of linear functionals.


Example 12. Define $\lambda : P \to \mathbb{R}$ by $\lambda(f) = \int_0^1 f(x)\,dx$. (The base field is $\mathbb{R}$ and the
polynomials have real coefficients.) ∎


Example 13. Define $\lambda : \mathbb{K}_{n \times n} \to \mathbb{K}$ by $\lambda(A) = \sum_{i=1}^{n} a_{ii}$. Here $A = (a_{ij})$ is an $n \times n$
matrix. The quantity $\sum_{i=1}^{n} a_{ii}$ is called the trace of $A$, often written $\mathrm{tr}(A)$. ∎

Theorem 3.4.12. Let $M$ be a subspace of a vector space $U$. The following are
equivalent:


(a) $M = \mathrm{Ker}(\lambda)$ for some nonzero linear functional $\lambda$ on $U$.
(b) $M$ has a one-dimensional complement.


Proof. (a) implies (b). Let $x \in U$ be such that $\lambda(x) \neq 0$. By replacing $x$ with $x/\lambda(x)$,
we may assume that $\lambda(x) = 1$. For $y \in U$, let $w = y - \lambda(y)x$. Then $\lambda(w) = \lambda(y) -
\lambda(\lambda(y)x) = \lambda(y) - \lambda(y)\lambda(x) = 0$. This shows that $w \in \mathrm{Ker}(\lambda) = M$; hence $y =
w + \lambda(y)x \in M + \mathrm{Span}(\{x\})$, and $U = M + \mathrm{Span}(\{x\})$. Next we show that $M \cap
\mathrm{Span}(\{x\}) = \{0\}$. This will complete the proof. If $y \in M \cap \mathrm{Span}(\{x\})$, then $y = ax$
for some $a \in \mathbb{K}$, and $\lambda(y) = 0$. But $\lambda(y) = a\lambda(x) = a$. Thus $a = 0$, and $y = 0$.

Conversely, suppose that $U = M \oplus \mathrm{Span}(\{x\})$ for some nonzero $x \in U$. Let $S_1$
be a basis for $M$, and let $S = S_1 \cup \{x\}$. Then $S$ is a basis for $U$. Define $\lambda : S \to \mathbb{K}$
by $\lambda(x) = 1$, and $\lambda(u) = 0$ for all $u \in S_1$. Finally, extend $\lambda$ to a linear functional,
which we also denote by $\lambda$, on $U$ according to theorem 3.4.4. The reader can easily
verify that $\mathrm{Ker}(\lambda) = M$. ∎
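
The decomposition $y = w + \lambda(y)x$ used in the first half of the proof can be made concrete with the trace functional of example 13. In the sketch below, which is an illustration under assumptions made here, matrices are lists of rows, $x$ is chosen to be $I/n$ so that its trace is 1, and the names `trace` and `split` are hypothetical helpers.

```python
from fractions import Fraction

def trace(a):
    """Sum of the diagonal entries of a square matrix given as a list of rows."""
    return sum(a[i][i] for i in range(len(a)))

def split(y):
    """Write y = w + trace(y) * x with trace(w) == 0, where x = I/n has trace 1."""
    n = len(y)
    t = trace(y)
    x = [[Fraction(int(i == j), n) for j in range(n)] for i in range(n)]
    w = [[y[i][j] - t * x[i][j] for j in range(n)] for i in range(n)]
    return w, t, x

y = [[1, 2], [3, 4]]
w, t, x = split(y)
# Recombine the two pieces to confirm that nothing was lost.
recombined = [[w[i][j] + t * x[i][j] for j in range(2)] for i in range(2)]
print(trace(w), t, recombined == y)
```

Here $w$ lies in the kernel of the trace (the trace-zero matrices) and $\lambda(y)x$ lies in the one-dimensional complement spanned by $x$.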


Example 14. Refer to example 12. Let $M = \mathrm{Ker}(\lambda)$. The following facts are easy to
verify: A basis for $M$ is $\{x^n - \frac{1}{n+1} : n \in \mathbb{N}\}$, and the one-dimensional subspace
$N$ of constant polynomials is a complement of $M$. Every polynomial $f$ can be
written as $f = g + c$, where $c = \int_0^1 f(t)\,dt$, and $g = f - c$. ∎


Definition. A proper subspace M of a vector space U is said to be a maximal 
subspace if it is not properly contained in any other proper subspace of U. 


Theorem 3.4.13. For a subspace M of a vector space U, each of the following is 
equivalent to each of the conditions (a) and (b) of the previous theorem: 


(a) M is a maximal subspace of U. 
(b) U/M has dimension 1. 


Definition. If $\dim(U/M) = 1$, $M$ is said to have co-dimension 1. More generally,
if $V$ is a subspace of $U$, then $\dim(U/V)$ is called the co-dimension of $V$ in $U$.
The concept is particularly useful when $\dim(U/V) < \infty$.


Another important vector space is the space of all linear transformations from one 
vector space U to another space V. 


Notation. Let $U$ and $V$ be vector spaces. The set of all linear transformations
from $U$ to $V$ is denoted by $\mathrm{Hom}(U, V)$. A linear mapping is also called a
homomorphism, hence the notation $\mathrm{Hom}(U, V)$.

It is easy to see that $\mathrm{Hom}(U, V)$ is a vector space with the following operations:
for $T_1, T_2 \in \mathrm{Hom}(U, V)$ and $a \in \mathbb{K}$, $(T_1 + T_2)(u) = T_1(u) + T_2(u)$, and $(aT_1)(u) =
aT_1(u)$.


An element of $\mathrm{Hom}(U, U)$ is often called a linear operator on $U$. $\mathrm{Hom}(U, U)$ has
additional structure provided by the composition of linear operators; if $S, T \in
\mathrm{Hom}(U, U)$, $(S \circ T)(u) = S(T(u))$. When there is no danger of ambiguity, we write
$ST$ for $S \circ T$. The composition of linear operators satisfies a number of properties
including, for example, $S(T_1 + T_2) = ST_1 + ST_2$.


Definition. A vector space $U$ over a field $\mathbb{K}$ is an algebra over $\mathbb{K}$, if $U$ possesses
another binary operation, to be called multiplication, such that, for all $u, v, w \in
U$, and all $a \in \mathbb{K}$, the following conditions are met:


(a) $u(vw) = (uv)w$
(b) $u(v + w) = uv + uw$, and $(u + v)w = uw + vw$
(c) $a(uv) = u(av) = (au)v$


The multiplication operation in an algebra is not necessarily commutative, and
an algebra need not contain a multiplicative identity element, although many
important algebras do. The simplest example of an algebra is the space of square
matrices $\mathbb{K}_{n \times n}$, where the binary operations are addition and multiplication of
matrices.


Theorem 3.4.14. Suppose $U$ and $V$ are vector spaces over a field $\mathbb{K}$. Then $\mathrm{Hom}(U, V)$
is a vector space, and $\mathrm{Hom}(U, U)$ is an algebra over $\mathbb{K}$. ∎


Exercises 


1. Prove theorem 3.4.1. 

2. Prove theorem 3.4.2. 

3. Let $T : U \to V$ be linear, and let $S_1$ be a basis for $\mathrm{Ker}(T)$. Augment $S_1$ to a
basis $S$ of $U$, and let $S_2 = S - S_1$. Prove that $T(S_2)$ is a basis for $R(T)$. This
result is a generalization of theorem 3.4.3 when $\dim(U) = \infty$.

4. Show that if there exists a linear mapping that maps $U$ onto $V$, then
$\dim(V) \le \dim(U)$.

5. Show that if there exists a one-to-one linear mapping from $U$ to $V$, then
$\dim(U) \le \dim(V)$.

6. Let $\{x_0, \dots, x_n\}$ be a set of distinct real numbers. Show that the mapping
$T : P_n \to \mathbb{K}^{n+1}$ given by $T(f) = (f(x_0), \dots, f(x_n))$ is an isomorphism.

7. Prove theorem 3.4.8.

8. Give an example to show that the algebraic complement of a subspace is not
unique.

9. Prove that if $U = U_1 \oplus U_2$, then $\dim(U) = \dim(U_1) + \dim(U_2)$.

10. In this problem, the base field is $\mathbb{R}$. Let $U_1$ be the subspace of $\mathbb{R}_{n \times n}$ of real
symmetric $n \times n$ matrices, and let $U_2$ be the subspace of skew-symmetric
matrices. Show that $\mathbb{R}_{n \times n} = U_1 \oplus U_2$.

11. Let $U = U_1 \oplus U_2$, and, for $i = 1, 2$, let $T_i$ be a linear mapping from $U_i$ to a
vector space $W$. Prove that there exists a unique linear mapping $T : U \to W$
such that $T|_{U_i} = T_i$.


Definition. Let $T : U \to U$ be linear. A subspace $V$ of $U$ is said to be $T$-
invariant if $T(V) \subseteq V$.


12. Prove that if $V$ is a $T$-invariant subspace of $U$, then the mapping $\tilde{T} : U/V \to
U/V$, defined by $\tilde{T}(x + V) = T(x) + V$, is linear. Part of the exercise is to
show that $\tilde{T}$ is well defined. If $\pi : U \to U/V$ is the quotient map, show that
$\pi \circ T = \tilde{T} \circ \pi$.

13. Let $U_1$ and $U_2$ be subspaces of a vector space $U$ such that $U = U_1 \oplus U_2$,
and let $\pi : U \to U_1$ be the projection of $U$ onto $U_1$. Prove that $\pi^2 = \pi$, that
$U_1 = \{x \in U : \pi(x) = x\} = R(\pi)$, and that $U_2 = \mathrm{Ker}(\pi)$. By definition,
$\pi^2(x) = \pi(\pi(x))$.


Definition. A linear operator $T$ on a vector space $U$ is said to
be idempotent if $T^2 = T$. The problem above says that the projection $\pi$ of
$U$ onto $U_1$ is idempotent.


14. Let $\pi$ be a linear idempotent operator on a vector space $U$. Prove that
$U = U_1 \oplus U_2$, where $U_1 = \{x \in U : \pi(x) = x\}$, and $U_2 = \mathrm{Ker}(\pi)$.

15. Prove theorem 3.4.13.

16. Exhibit a basis for the null-space of the functional in example 13.

17. Prove theorem 3.4.14.

18. Let $U$ be an $n$-dimensional vector space, and let $U^* = \mathrm{Hom}(U, \mathbb{K})$. Suppose
$\{u_1, \dots, u_n\}$ is a basis for $U$, and, for each $1 \le i \le n$, define a linear functional
$\lambda_i \in U^*$ by $\lambda_i(u_j) = \delta_{ij}$ $(1 \le j \le n)$ (see theorem 3.4.4). Prove that $\{\lambda_i : 1 \le
i \le n\}$ is a basis for $U^*$. The space $U^*$ is called the dual of $U$.

19. Let $\lambda_1, \dots, \lambda_n$ be linear functionals on a vector space $U$, let $M_i = \mathrm{Ker}(\lambda_i)$,
and let $N = \bigcap_{i=1}^{n} M_i$. Prove that $\dim(U/N) \le n$. Hint: Define $T : U \to \mathbb{K}^n$
by $T(x) = (\lambda_1(x), \dots, \lambda_n(x))$. Use theorem 3.4.7.


3.5 Matrix Representation and Diagonalization


A careful reading of example 2 in section 3.4 reveals that the set of linear mappings
from $\mathbb{K}^n$ to $\mathbb{K}^m$ is in one-to-one correspondence with the set of $m \times n$ matrices.
This section generalizes this result. Suppose $U$ and $V$ are finite-dimensional vector
spaces and that $\{u_1, \dots, u_n\}$ and $\{v_1, \dots, v_m\}$ are bases for $U$ and $V$, respectively.
Theorem 3.4.4 states that a linear mapping $T : U \to V$ is uniquely determined by
the vectors $T(u_1), \dots, T(u_n)$. Since each of the vectors $T(u_j)$ can be uniquely written
as a linear combination of $\{v_1, \dots, v_m\}$ with coefficients in $\mathbb{K}$, the set of coefficients
determines $T$ uniquely. This observation is the basis for the opening definition of
this section. The information in this section is standard, and we assume familiarity
with its contents.


Matrix Representations of Linear Mappings 


Let $U$ and $V$ be finite-dimensional vector spaces, and let $n = \dim(U)$, $m = \dim(V)$.
Fix a pair of bases $B = \{u_1, \dots, u_n\}$ and $C = \{v_1, \dots, v_m\}$ for $U$ and $V$, respectively.
If $T \in \mathrm{Hom}(U, V)$, then, for every $1 \le j \le n$, $T(u_j)$ can be written as a linear
combination of $C$, say, $T(u_j) = \sum_{i=1}^{m} a_{ij}v_i$.


Definition. Given the construction in the previous paragraph, the matrix
$A = (a_{ij})$ is called the matrix of $T$ relative to the base pair $(B, C)$.


The matrix representing a linear mapping is totally dependent on the base pair
$(B, C)$ and is even sensitive to the permutation of the elements in each basis. Thus
the bases $B$ and $C$ are assumed to be ordered.


Example 1. Consider the linear transformation $T : \mathbb{K}^n \to \mathbb{K}^m$ induced by an
$m \times n$ matrix $A$. It is clear that the matrix of $T$ relative to the base pair $(B_n, B_m)$
is $A$, where $B_n$ and $B_m$ are the canonical bases for $\mathbb{K}^n$ and $\mathbb{K}^m$, respectively. ∎


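Example 2. Let $T : P_n \to P_n$ be the linear transformation $T(f) = \frac{df}{dx}$.
If $B = \{1, x, \dots, x^n\}$, then the matrix of $T$ relative to $(B, B)$ is the $(n + 1) \times (n + 1)$ matrix

$$
\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 2 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & n \\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}. \;∎
$$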


Theorem 3.5.1. The function $\Phi : \mathrm{Hom}(U, V) \to \mathbb{K}_{m \times n}$ that assigns to an element
$T \in \mathrm{Hom}(U, V)$ its matrix relative to the base pair $(B, C)$ is an isomorphism. ∎

Now we study how the matrix of the composition of two linear transformations
relates to the matrices of the composed transformations.


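Theorem 3.5.2. Let $U$, $V$, $B$, $C$, and $T$ be as in the above definition, let $D =
\{w_1, \dots, w_p\}$ be a basis for a third vector space $W$, and, finally, let $S \in \mathrm{Hom}(V, W)$.
If $A = (a_{ij})$ is the matrix of $T$ relative to the base pair $(B, C)$, and $A' = (a'_{ki})$ is the
matrix of $S$ relative to the base pair $(C, D)$, then the matrix product $A'A$ is the
matrix of $S \circ T$ relative to the base pair $(B, D)$.


Proof. Let $1 \le j \le n$. Then

$$
(S \circ T)(u_j) = S(T(u_j)) = S\Big(\sum_{i=1}^{m} a_{ij}v_i\Big) = \sum_{i=1}^{m} a_{ij}S(v_i)
= \sum_{i=1}^{m} a_{ij} \sum_{k=1}^{p} a'_{ki}w_k
= \sum_{k=1}^{p} \Big(\sum_{i=1}^{m} a'_{ki}a_{ij}\Big) w_k = \sum_{k=1}^{p} e_{kj}w_k,
$$

where $e_{kj} = \sum_{i=1}^{m} a'_{ki}a_{ij}$. Thus the matrix of $S \circ T$ relative to $(B, D)$ is $E = (e_{kj})$. By the definition of matrix
multiplication, $e_{kj}$ is the $(k, j)$ entry of the product $A'A$. ∎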
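
A small check of theorem 3.5.2: take $U = V = W = P_n$ with the monomial basis and $S = T =$ differentiation, so that $S \circ T$ is the second derivative of example 5 in section 3.4. The sketch below compares the product of the two matrices with the matrix of $f \mapsto f''$ computed directly from $T(x^j) = j(j-1)x^{j-2}$. The helper names and the choice $n = 4$ are assumptions made for this illustration.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def derivative_matrix(n):
    """Matrix of f -> f' on P_n relative to 1, x, ..., x^n."""
    size = n + 1
    d = [[0] * size for _ in range(size)]
    for j in range(1, size):
        d[j - 1][j] = j            # (x^j)' = j x^(j-1)
    return d

def second_derivative_matrix(n):
    """Matrix of f -> f'' on P_n, built directly from (x^j)'' = j(j-1) x^(j-2)."""
    size = n + 1
    m = [[0] * size for _ in range(size)]
    for j in range(2, size):
        m[j - 2][j] = j * (j - 1)
    return m

n = 4
D = derivative_matrix(n)
product = matmul(D, D)             # matrix of the composition, per theorem 3.5.2
print(product == second_derivative_matrix(n))
```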


The above theorem is the crucial piece of information needed to prove the
following theorem, which is a special case of theorem 3.5.1 when $V = U$ and $C = B$.


Theorem 3.5.3. The function $\Phi : \mathrm{Hom}(U, U) \to \mathbb{K}_{n \times n}$ that assigns to an element
$T \in \mathrm{Hom}(U, U)$ its matrix relative to the base $B$ (i.e., relative to the base pair
$(B, B)$) is an algebra isomorphism. Thus, in addition to being linear, $\Phi$ satisfies
$\Phi(S \circ T) = \Phi(S)\Phi(T)$. ∎


Definition. Let $U$ be an $n$-dimensional vector space, and let $B = \{u_1, \dots, u_n\}$ and
$B' = \{u'_1, \dots, u'_n\}$ be two bases for $U$. Every vector $u_j \in B$ is a linear combination
of the base $B'$, $u_j = \sum_{i=1}^{n} p_{ij}u'_i$. The resulting matrix $P = (p_{ij})$ is called the matrix
from $B$ to $B'$.


It is important to understand that the matrix $P$ from $B$ to $B'$ is the matrix of the
identity transformation $I_U : U \to U$ relative to the base pair $(B, B')$.


Example 3. Notice that if $P$ is the matrix from $B$ to $B'$, then $P^{-1}$ is the matrix from
$B'$ to $B$. Indeed, let $Q$ be the matrix from $B'$ to $B$, and consider the mapping
$T = I_U$ with the base pair $(B, B')$ (its matrix is $P$), and the mapping $S = I_U$ with
the base pair $(B', B)$ (its matrix is $Q$). Consider the matrix of the composition
$S \circ T$. On the one hand, its matrix relative to $(B, B)$ is $QP$ by theorem 3.5.2. On the
other hand, the matrix of $I_U$ relative to $(B, B)$ is the identity matrix $I_n$. Therefore
$QP = I_n$, and $Q = P^{-1}$. ∎


Example 4. Given a basis $B$ for $U$ and an invertible $n \times n$ matrix $P$, there is a basis
$B'$ for $U$ such that $P$ is the matrix from $B$ to $B'$. To see this, let $Q = (q_{ij})$ be the
inverse of $P$, and define $u'_j = \sum_{i=1}^{n} q_{ij}u_i$. The set $B' = \{u'_1, \dots, u'_n\}$ is a basis for $U$
(see problem 3 at the end of this section), and the matrix from $B'$ to $B$ is $Q$. By
example 3, $P$ is the matrix from $B$ to $B'$. ∎


Example 5. As another application of problem 3, let $B' = \{P_0, \dots, P_n\}$ be polyno-
mials such that, for each $0 \le i \le n$, $P_i$ has exact degree $i$. Then $B'$ is a basis for
$P_n$. To see this, write $P_i = \sum_{j=0}^{i} q_{ij}x^j$. The lower triangular matrix $Q = (q_{ij})$ is
invertible because its determinant is $\prod_{i=0}^{n} q_{ii} \neq 0$. Since $B = \{1, \dots, x^n\}$ is a basis
for $P_n$, so is $B'$, by problem 3. ∎


The above discussion leads to the following. 


Theorem 3.5.4. Let $U$ be an $n$-dimensional vector space. Then the collection of
bases for $U$ is in one-to-one correspondence with the collection of invertible $n \times n$
matrices.


Proof. Fix a basis $B$ for $U$. For another basis $B'$ for $U$, let $P$ be the matrix from
$B$ to $B'$. The correspondence $\Psi : B' \mapsto P$ is the correspondence promised by the
theorem. We leave the rest of the formalities to the reader. The examples preceding
this theorem are relevant for verifying the details. ∎


Theorem 3.5.5 (change of base formula). Let $U$ and $V$ be finite-dimensional
vector spaces, and let $n = \dim(U)$, $m = \dim(V)$. Let $B$ and $B'$ be bases for $U$, and
let $C$ and $C'$ be bases for $V$. Let $T \in \mathrm{Hom}(U, V)$ and suppose that $A$ is the matrix
of $T$ relative to the base pair $(B, C)$ and that $A'$ is the matrix of $T$ relative to the base
pair $(B', C')$. If $P$ is the matrix from $B'$ to $B$ and $Q$ is the matrix from $C'$ to $C$, then
$A' = Q^{-1}AP$.


Proof. Consider diagram 1. Each corner contains a pair: a space and a basis. The
top arrow prompts the reader to consider the mapping $T : U \to V$ and mind the
bases indicated in the top corners of the diagram. Thus the matrix of the mapping
$T$ represented by the top arrow is relative to the base pair $(B, C)$ and is therefore $A$.
Likewise, the matrices representing the rest of the mappings indicated on the diagram
are $Q^{-1}$ for $I_V$, $P^{-1}$ for $I_U$, and $A'$ for the mapping depicted by the bottom arrow.
Now $I_V \circ T = T \circ I_U$. Applying theorem 3.5.2 to each side of the above equation, we
get $Q^{-1}A = A'P^{-1}$, or $A' = Q^{-1}AP$. ∎

    (U, B)  ----T---->  (V, C)
       |                   |
      I_U                 I_V
       v                   v
    (U, B') ----T---->  (V, C')

    Diagram 1


Corollary 3.5.6. Let $U$ be an $n$-dimensional vector space, let $T \in \mathrm{Hom}(U, U)$, and
let $B$ and $B'$ be bases for $U$. If $A$ is the matrix of $T$ relative to $B$, and $A'$ is the matrix
of $T$ relative to $B'$, then $A' = P^{-1}AP$, where $P$ is the matrix from $B'$ to $B$.


Proof. This is the special case of the above theorem when $V = U$, $C = B$, and
$C' = B'$. ∎
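
The formula $A' = P^{-1}AP$ can be checked on a small case. In the sketch below, an illustration under assumptions chosen here, $T$ is differentiation on $P_2$, $B = \{1, x, x^2\}$, and $B' = \{1,\ 1 + x,\ 1 + x + x^2\}$. The columns of $P$ are the coordinates of the elements of $B'$ relative to $B$, and `A_direct` is the matrix of $T$ relative to $B'$ worked out by hand from $T(1) = 0$, $T(1 + x) = 1$, and $T(1 + x + x^2) = 1 + 2x = -1 + 2(1 + x)$.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Matrix of differentiation on P_2 relative to B = {1, x, x^2}.
A = [[0, 1, 0],
     [0, 0, 2],
     [0, 0, 0]]

# Matrix from B' to B: column j lists the B-coordinates of the j-th element of B'.
P = [[1, 1, 1],
     [0, 1, 1],
     [0, 0, 1]]

# Inverse of P, written out by hand and verified in the checks below.
P_inv = [[1, -1, 0],
         [0, 1, -1],
         [0, 0, 1]]

# Matrix of differentiation relative to B', computed directly from the definition.
A_direct = [[0, 1, -1],
            [0, 0, 2],
            [0, 0, 0]]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A_prime = matmul(P_inv, matmul(A, P))
print(matmul(P, P_inv) == identity, A_prime == A_direct)
```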


Diagonalization 


When the matrix representing a linear operator $T$ on a finite-dimensional vector
space relative to a basis $B$ is diagonal, the action of $T$ on $B$ is quite simple: $T$ maps
each element of $B$ to a multiple of itself. The following question is natural: given an
operator $T \in \mathrm{Hom}(U, U)$, can you find a basis for $U$ relative to which the matrix of
$T$ is diagonal? By corollary 3.5.6, the matrix equivalent of the question is as follows:
given an arbitrary square matrix $A$, can you find an invertible matrix $P$ such that
$P^{-1}AP$ is diagonal? In general, the answer to both questions is no. The following
definitions formalize the discussion.


Definition. A linear operator T on a finite-dimensional vector space U is diag- 
onalizable if U contains a basis relative to which the matrix of T is diagonal. 
Equivalently, U possesses a basis B consisting entirely of eigenvectors of T. 


Definition. A square matrix A is diagonalizable if there exists an invertible matrix
P such that $P^{-1}AP$ is diagonal.


The following theorem gives a necessary and sufficient condition for a square 
matrix (linear operator) to be diagonalizable. 


Theorem 3.5.7. A square matrix A is diagonalizable if and only if $\mathbb{K}^n$ has a basis
consisting entirely of eigenvectors of A.


Proof. Suppose A is diagonalizable. Thus there exists an invertible matrix P such that
$P^{-1}AP = D$, a diagonal matrix. Let $\lambda_1, \dots, \lambda_n$ be the diagonal entries of D, and let
$P = [u_1, \dots, u_n]$ be a partitioning of P by its columns. The equation $P^{-1}AP = D$
is equivalent to $A[u_1, \dots, u_n] = PD$, or $[Au_1, \dots, Au_n] = [\lambda_1 u_1, \dots, \lambda_n u_n]$. Thus
$Au_i = \lambda_i u_i$ for $1 \le i \le n$, and, since P is invertible, $\{u_1, \dots, u_n\}$ is a basis of $\mathbb{K}^n$
consisting of eigenvectors of A. To prove the converse, we simply reverse the above argument.
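The converse direction can be checked by hand on a small case. The sketch below is only an illustration: the matrix A and the eigenvectors placed in the columns of P are chosen for this example and do not come from the text, and the helper names are hypothetical. Exact rational arithmetic is used so that $P^{-1}AP$ can be compared with a diagonal matrix without rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(P):
    """Inverse of a 2x2 matrix via the adjugate formula (exact with Fractions)."""
    (a, b), (c, d) = P
    det = a * d - b * c
    if det == 0:
        raise ValueError("P is singular, so its columns do not form a basis")
    return [[d / det, -b / det], [-c / det, a / det]]

# Illustrative matrix (an assumption of this example, not taken from the text).
A = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(2)]]

# Columns of P: u1 = (1, 1) with A u1 = 3 u1, and u2 = (1, -1) with A u2 = u2.
P = [[Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(-1)]]

# D should be diagonal, with the eigenvalues in the order of the columns of P.
D = matmul(matmul(inverse_2x2(P), A), P)
print(D)
```

The order of the diagonal entries of D follows the order in which the eigenvectors are listed as columns of P.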


We will discuss in section 3.7 a class of matrices that can be diagonalized in a
very special way. We will also extend the discussion to infinite-dimensional spaces
in chapter 7. We conclude the section with two examples of linear operators on 
infinite-dimensional spaces. In the first example, the operator has uncountably 
many eigenvalues; in the second, it has none. 


Example 6. Let T be an operator on $C^\infty(\mathbb{R})$, defined by $T(f) = \frac{d^2 f}{dx^2} + f$. It is easy
to verify that, for every nonzero $\omega \in \mathbb{R}$, the function $f_\omega(x) = \sin(\omega x)$ is an eigenfunction
of T corresponding to the eigenvalue $\lambda_\omega = 1 - \omega^2$. ◆


Example 7. Let T be an operator on $C^\infty(\mathbb{R})$, defined by $(Tf)(x) = x f(x)$. We verify
that T has no eigenvalues. If $T(f) = \lambda f$, then $x f(x) = \lambda f(x)$ for all $x \in \mathbb{R}$. This
implies that $f(x) = 0$ for all $x \ne \lambda$. The continuity of f implies that $f(\lambda) = 0$; thus
$f = 0$. ◆


Exercises 


1. Let A and B be $n \times n$ matrices such that $AB = I_n$. Prove that A is invertible
and hence $B = A^{-1}$.

2. Let U be a finite-dimensional vector space, and let $T : U \to U$ be linear. Prove
that if V is an r-dimensional, T-invariant subspace of U, then there is a basis
for U relative to which the matrix of T has the form

$$
\begin{pmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{pmatrix},
$$

where $A_{11}$ is an $r \times r$ submatrix.

3. Given a basis $B = \{u_1, \dots, u_n\}$ for U and an invertible $n \times n$ matrix $Q = (q_{ij})$,
define $u'_j = \sum_{i=1}^{n} q_{ij} u_i$. Show that the set $B' = \{u'_1, \dots, u'_n\}$ is a basis for U.

4. Let A be an $n \times n$ matrix and let P be an invertible $n \times n$ matrix. Show that
A and $P^{-1}AP$ have the same eigenvalues. It follows from this result that the
eigenvalues of a linear operator T on a finite-dimensional space are those of
the matrix representing T relative to any basis.

5. Let U be a finite-dimensional vector space, and let $T \in \mathrm{Hom}(U, U)$. Prove
that T is diagonalizable if and only if its matrix relative to any basis is
diagonalizable.




6. Let U be a finite-dimensional vector space, and let $T \in \mathrm{Hom}(U, U)$. Define
det(T) to be the determinant of the matrix representing T relative to some
basis of U. Prove that det(T) is independent of the choice of the basis.

7. Let T be a linear mapping from a finite-dimensional vector space U to a
finite-dimensional vector space V. Prove that there exists a pair of bases for
U and V relative to which the matrix of T is diagonal. Hint: See theorem B.3
in appendix B.

8. Let $T : P_n \to P_n$ be the linear operator $T(f) = f'$. Show that T is not
diagonalizable.


3.6 Normed Linear Spaces 


Let us examine the function $d : \mathbb{R}^2 \to \mathbb{R}$, which assigns to a point $x = (x_1, x_2) \in \mathbb{R}^2$ its
distance from the origin. Thus $d(x) = (x_1^2 + x_2^2)^{1/2}$. The function d has the following
characteristics:


(1) $d(x) \ge 0$, and $d(x) = 0$ if and only if $x = 0$.
(2) For a real scalar a and a point $x \in \mathbb{R}^2$, $d(ax) = |a|\,d(x)$.
(3) For $x, y \in \mathbb{R}^2$, $d(x + y) \le d(x) + d(y)$.


The abstraction of the function d to an arbitrary vector space yields the definition 
of a normed linear space. Instead of using the notation d(x), we use the universally 
accepted notation ||x|| for the length of a vector x, or its distance from the zero 
vector. 

Normed linear spaces are the most common examples of metric spaces. What
sets norms apart, still using the function d on $\mathbb{R}^2$ as our prototype, is the fact that
the distance function between two points in the plane is translation invariant
in the sense that if $D : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ is the function
$D(x, y) = \{(x_1 - y_1)^2 + (x_2 - y_2)^2\}^{1/2}$, then $D(x, y) = D(x - a, y - a)$ for all $x, y, a \in \mathbb{R}^2$. Equivalently,
$D(x, y) = D(x - y, 0) = d(x - y)$. See the definition of a translation later on in
this section. This property makes no sense for a general metric space because the
underlying set of a metric space is not required to be a vector space.


Definition. A normed linear space is a vector space X over $\mathbb{K}$ together with a
function $\|\cdot\| : X \to \mathbb{R}$ such that, for all $x, y \in X$ and all $a \in \mathbb{K}$,


(a) $\|x\| \ge 0$, and $\|x\| = 0$ if and only if $x = 0$,
(b) $\|ax\| = |a|\,\|x\|$, and
(c) $\|x + y\| \le \|x\| + \|y\|$.

The function $\|\cdot\|$ is called a norm on X, and condition (c) in the above definition
is known as the triangle inequality.


Motivated by the discussion about the translation invariance of the distance 
function in the plane, the following definition makes sense. 


Definition. The distance between two points x and y in a normed linear space X
is the scalar $\|x - y\|$.


The reader can easily verify that the defining conditions of a norm are satisfied 
in each of the examples below. 


Example 1. Let $X = \mathbb{K}^n$, and define the 1-norm of $x = (x_1, \dots, x_n)$ by
$\|x\|_1 = \sum_{i=1}^{n} |x_i|$. ◆

Example 2. Let $X = \mathbb{K}^n$, and define the $\infty$-norm of x by
$\|x\|_\infty = \max_{1 \le i \le n} |x_i|$. ◆


Example 3. Let $\ell^\infty$ be the space of bounded sequences discussed in section 3.1.
The norm of a bounded sequence $x = (x_n)$ is defined by

$$
\|x\|_\infty = \sup_{n \in \mathbb{N}} |x_n|.
$$


Example 4. In section 3.1, we defined the space $X = B[a, b]$ to consist of all
bounded functions on the interval [a,b]. The supremum norm (also the
uniform or $\infty$-norm) of a function $f \in X$ is defined by

$$
\|f\|_\infty = \sup_{x \in [a,b]} |f(x)|.
$$

We verify the triangle inequality here. If f and g are bounded functions on [a, b]
and $x \in [a, b]$, then $|(f + g)(x)| \le |f(x)| + |g(x)| \le \|f\|_\infty + \|g\|_\infty$.
Thus $\|f + g\|_\infty = \sup_{x \in [a,b]} |(f + g)(x)| \le \|f\|_\infty + \|g\|_\infty$.


Example 5. An important subspace of B[a, b] is the space C[a, b] of continuous 
functions on [a,b]. Both spaces are given the uniform norm. 


Example 6. Another useful norm on C[a, b] is the 1-norm defined by 


$$
\|f\|_1 = \int_a^b |f(x)|\,dx.
$$

Example 7. Consider the following sequence of functions in C[0, 1]:

$$
f_n(x) = \begin{cases}
2n^3 x & \text{if } 0 \le x \le \frac{1}{2n^2}, \\
-2n^3 \left( x - \frac{1}{n^2} \right) & \text{if } \frac{1}{2n^2} \le x \le \frac{1}{n^2}, \\
0 & \text{if } \frac{1}{n^2} \le x \le 1.
\end{cases}
$$

It is clear that $\|f_n\|_\infty = f_n(1/2n^2) = n$ and $\|f_n\|_1 = \frac{1}{2n}$. ◆
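The two norms in Example 7 can be computed exactly for a given n. The sketch below assumes the reading of the example in which $f_n$ is the tent function rising linearly from 0 to its peak at $x = 1/(2n^2)$ with slope $2n^3$ and falling back to 0 at $x = 1/n^2$; the function names are hypothetical. Because $f_n$ is piecewise linear, its supremum is attained at a breakpoint and the trapezoid rule on the breakpoints gives the integral exactly.

```python
from fractions import Fraction

def f(n, x):
    """Tent function f_n (assumed form), evaluated exactly at a Fraction x in [0, 1]."""
    peak = Fraction(1, 2 * n * n)      # location of the maximum, 1/(2n^2)
    end = Fraction(1, n * n)           # right end of the support, 1/n^2
    if 0 <= x <= peak:
        return 2 * n**3 * x
    if peak < x <= end:
        return -2 * n**3 * (x - end)
    return Fraction(0)

def breakpoints(n):
    """Points of [0, 1] where the linear pieces of f_n meet."""
    return [Fraction(0), Fraction(1, 2 * n * n), Fraction(1, n * n), Fraction(1)]

def sup_norm(n):
    """A piecewise linear function attains its maximum modulus at a breakpoint."""
    return max(abs(f(n, x)) for x in breakpoints(n))

def one_norm(n):
    """Integral of |f_n| over [0, 1]; the trapezoid rule is exact on each linear piece."""
    pts = breakpoints(n)
    return sum((b - a) * (abs(f(n, a)) + abs(f(n, b))) / 2
               for a, b in zip(pts, pts[1:]))

for n in (1, 2, 5):
    print(n, sup_norm(n), one_norm(n))
```

Under this reading the uniform norms grow without bound while the 1-norms tend to zero, which is the contrast between the two norms on C[0, 1] that the example is pointing at.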


Example 8. Define a function $\|\cdot\|$ on the space $\mathbb{R}_{n \times n}$ of $n \times n$ matrices as follows:
let $a_1, \dots, a_n$ be the rows of a matrix A and set

$$
\|A\|_\infty = \max_{1 \le i \le n} \|a_i\|_1.
$$

The function defined above is a norm on $\mathbb{R}_{n \times n}$. We verify the triangle inequality:
if $b_1, \dots, b_n$ are the rows of another matrix B, then the rows of A + B are
$a_1 + b_1, \dots, a_n + b_n$, and, for $1 \le i \le n$,

$$
\|a_i + b_i\|_1 \le \|a_i\|_1 + \|b_i\|_1 \le \max_{1 \le i \le n} \|a_i\|_1 + \max_{1 \le i \le n} \|b_i\|_1 = \|A\|_\infty + \|B\|_\infty.
$$

Thus $\|A + B\|_\infty \le \|A\|_\infty + \|B\|_\infty$. ◆

The matrix norm in the above example is compatible with the $\infty$-norm on $\mathbb{R}^n$ in
the sense that, for $x \in \mathbb{R}^n$, $\|Ax\|_\infty \le \|A\|_\infty \|x\|_\infty$, as the reader can easily verify.
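The compatibility inequality is easy to test numerically. The following sketch uses an arbitrary integer matrix and vector chosen for illustration (they are not from the text) and hypothetical helper names; it computes the row-sum norm of A, the max norm of x and of Ax, and compares the two sides.

```python
def vec_inf_norm(x):
    """Largest absolute value among the entries of x."""
    return max(abs(t) for t in x)

def mat_inf_norm(A):
    """Maximum of the 1-norms of the rows of A."""
    return max(sum(abs(t) for t in row) for row in A)

def matvec(A, x):
    """Matrix-vector product, with A given as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

# Illustrative data (an assumption of this example).
A = [[1, -2, 3],
     [0, 4, -1],
     [-5, 1, 1]]
x = [2, -1, 1]

lhs = vec_inf_norm(matvec(A, x))            # ||Ax||_inf
rhs = mat_inf_norm(A) * vec_inf_norm(x)     # ||A||_inf * ||x||_inf
print(lhs, rhs, lhs <= rhs)
```

A single instance proves nothing, of course; the point is only to make the roles of the row sums and of the largest entry of x concrete before writing the general argument.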

$\ell^p$ Spaces

We now define the rest of the $\ell^p$ spaces.


Definition. For every real number $1 \le p < \infty$, define $\ell^p$ to be the set of all
sequences $x = (x_1, x_2, \dots)$ in $\mathbb{K}$ such that $\sum_{n=1}^{\infty} |x_n|^p < \infty$. For $x \in \ell^p$,

$$
\|x\|_p = \Big( \sum_{n=1}^{\infty} |x_n|^p \Big)^{1/p}.
$$

It is straightforward to verify that $\ell^1$ is a normed linear space. For example, the
triangle inequality is obtained from the following inequality upon taking the limit
as $n \to \infty$: $\sum_{i=1}^{n} |x_i + y_i| \le \sum_{i=1}^{n} |x_i| + \sum_{i=1}^{n} |y_i|$.




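Showing that $\ell^p$ (for $1 < p < \infty$) is a normed linear space is less straightforward
and requires the development of two useful inequalities which are important in
their own right.


Definition. Let $1 < p < \infty$. The conjugate Hölder exponent of p is the number
$q > 1$ such that $\frac{1}{p} + \frac{1}{q} = 1$. By definition, $p = 1$ and $q = \infty$ are conjugate Hölder
exponents.


Lemma 3.6.1. If $p > 1$, and $x, y \in \mathbb{C}$, then $|xy| \le \frac{|x|^p}{p} + \frac{|y|^q}{q}$. Here p and q are
conjugate Hölder exponents.


Proof. Consider the function $f(t) = t^{1/p} - \frac{t}{p} - \frac{1}{q}$. For $t > 1$, $f'(t) = \frac{1}{p} t^{1/p - 1} - \frac{1}{p} < 0$. Thus
f is decreasing for all $t \ge 1$, and since $f(1) = 0$, it follows that $f(t) \le 0$ for all $t \ge 1$.
Thus $t^{1/p} \le \frac{t}{p} + \frac{1}{q}$ for $t \ge 1$. Now let $a, b > 0$ and, say, $\frac{a}{b} \ge 1$. By replacing t with
a/b, we obtain $\frac{a^{1/p}}{b^{1/p}} \le \frac{a}{pb} + \frac{1}{q}$. Therefore, $a^{1/p} b^{1 - 1/p} \le \frac{a}{p} + \frac{b}{q}$, or $a^{1/p} b^{1/q} \le \frac{a}{p} + \frac{b}{q}$.
(The case $a/b < 1$ follows by interchanging the roles of the pairs (a, p) and (b, q), and the
inequality is trivial when $xy = 0$.) Letting $a = |x|^p$, $b = |y|^q$, we obtain the inequality we seek. ∎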


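Theorem 3.6.2 (Hölder's inequality). If $x = (x_n) \in \ell^p$ and $y = (y_n) \in \ell^q$, then
$z = (x_n y_n) \in \ell^1$ and $\|z\|_1 \le \|x\|_p \|y\|_q$.


Proof. If $p = 1$, $q = \infty$, $x \in \ell^1$, and $y \in \ell^\infty$, then

$$
\sum_{n=1}^{\infty} |x_n y_n| \le \sup_n |y_n| \sum_{n=1}^{\infty} |x_n| = \|y\|_\infty \|x\|_1.
$$

Now let $1 < p, q < \infty$; we may assume that $x \ne 0$ and $y \ne 0$. Applying lemma 3.6.1
to $x_i / \|x\|_p$ and $y_i / \|y\|_q$, we obtain

$$
\frac{|x_i y_i|}{\|x\|_p \|y\|_q} \le \frac{1}{p} \frac{|x_i|^p}{\|x\|_p^p} + \frac{1}{q} \frac{|y_i|^q}{\|y\|_q^q}.
$$

Summing over $1 \le i \le n$,

$$
\frac{1}{\|x\|_p \|y\|_q} \sum_{i=1}^{n} |x_i y_i|
\le \frac{1}{p \|x\|_p^p} \sum_{i=1}^{n} |x_i|^p + \frac{1}{q \|y\|_q^q} \sum_{i=1}^{n} |y_i|^q
\le \frac{1}{p \|x\|_p^p} \sum_{i=1}^{\infty} |x_i|^p + \frac{1}{q \|y\|_q^q} \sum_{i=1}^{\infty} |y_i|^q
= \frac{1}{p} + \frac{1}{q} = 1.
$$

The summary of the above calculations is that $\frac{1}{\|x\|_p \|y\|_q} \sum_{i=1}^{n} |x_i y_i| \le 1$. Taking the
limit as $n \to \infty$ we obtain Hölder's inequality.


Theorem 3.6.3 (Minkowski's inequality). If $x = (x_n), y = (y_n) \in \ell^p$, then $x + y \in \ell^p$,
and

$$
\|x + y\|_p \le \|x\|_p + \|y\|_p.
$$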


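Proof. We already proved the theorem for $p = 1$ and $p = \infty$, so assume that $1 < p < \infty$,
and let q be the conjugate Hölder exponent of p. Then:

$$
\sum_{i=1}^{n} |x_i + y_i|^p \le \sum_{i=1}^{n} |x_i + y_i|^{p-1} (|x_i| + |y_i|).
$$

Applying Hölder's inequality to the right side of the above inequality, and recalling
that $(p - 1)q = p$, yields

$$
\sum_{i=1}^{n} |x_i + y_i|^{p-1} |x_i| + \sum_{i=1}^{n} |x_i + y_i|^{p-1} |y_i|
\le \Big[ \Big( \sum_{i=1}^{n} |x_i|^p \Big)^{1/p} + \Big( \sum_{i=1}^{n} |y_i|^p \Big)^{1/p} \Big] \Big( \sum_{i=1}^{n} |x_i + y_i|^p \Big)^{1/q}
\le (\|x\|_p + \|y\|_p) \Big( \sum_{i=1}^{n} |x_i + y_i|^p \Big)^{1/q}.
$$

The summary of the above calculations is that

$$
\sum_{i=1}^{n} |x_i + y_i|^p \le (\|x\|_p + \|y\|_p) \Big( \sum_{i=1}^{n} |x_i + y_i|^p \Big)^{1/q}.
$$

Thus

$$
\Big( \sum_{i=1}^{n} |x_i + y_i|^p \Big)^{1 - 1/q} \le \|x\|_p + \|y\|_p.
$$

Taking the limit as $n \to \infty$, and recalling that $1 - 1/q = 1/p$, we have

$$
\Big( \sum_{i=1}^{\infty} |x_i + y_i|^p \Big)^{1/p} \le \|x\|_p + \|y\|_p, \quad \text{or} \quad \|x + y\|_p \le \|x\|_p + \|y\|_p.
$$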


We have verified all the crucial details needed to prove the result below.


Theorem 3.6.4. For $1 \le p < \infty$, $\ell^p$ is a normed linear space.

Observe that Hölder's inequality and the triangle inequality apply to finite
sequences:

$$
\sum_{i=1}^{n} |x_i y_i| \le \Big( \sum_{i=1}^{n} |x_i|^p \Big)^{1/p} \Big( \sum_{i=1}^{n} |y_i|^q \Big)^{1/q}, \quad \text{and}
$$

$$
\Big( \sum_{i=1}^{n} |x_i + y_i|^p \Big)^{1/p} \le \Big( \sum_{i=1}^{n} |x_i|^p \Big)^{1/p} + \Big( \sum_{i=1}^{n} |y_i|^p \Big)^{1/p}.
$$

Thus, for $1 \le p < \infty$, $(\mathbb{R}^n, \|\cdot\|_p)$ is a normed linear space.
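The finite versions of Hölder's and Minkowski's inequalities can be spot-checked in floating point. In the sketch below the vectors and the exponent p = 3 are arbitrary choices made for this illustration, and the function names are hypothetical; a small tolerance absorbs rounding error.

```python
def p_norm(x, p):
    """The p-norm of a finite sequence x, for 1 <= p < infinity."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def conjugate_exponent(p):
    """The number q with 1/p + 1/q = 1, for 1 < p < infinity."""
    return p / (p - 1.0)

# Illustrative data (an assumption of this example).
x = [3.0, -1.0, 2.0, 0.5]
y = [-2.0, 4.0, 1.0, 1.5]
p = 3.0
q = conjugate_exponent(p)

# Hoelder: sum |x_i y_i| <= ||x||_p ||y||_q
holder_lhs = sum(abs(a * b) for a, b in zip(x, y))
holder_rhs = p_norm(x, p) * p_norm(y, q)

# Minkowski: ||x + y||_p <= ||x||_p + ||y||_p
minkowski_lhs = p_norm([a + b for a, b in zip(x, y)], p)
minkowski_rhs = p_norm(x, p) + p_norm(y, p)

print(holder_lhs <= holder_rhs + 1e-12, minkowski_lhs <= minkowski_rhs + 1e-12)
```

Trying other values of p, including values close to 1, shows how the gap between the two sides changes while the inequalities persist.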


Balls, Lines, and Convex Sets 


Definition. Let X be a normed linear space. The open ball of radius r centered at
$x \in X$ is the set

$$
B(x, r) = \{ y \in X : \|y - x\| < r \}.
$$


Example 9. In $(\mathbb{R}^2, \|\cdot\|_\infty)$, the open ball of radius r centered at the point $(x_0, y_0)$
is the open square $\{(x, y) : |x - x_0| < r, |y - y_0| < r\}$. In $(\mathbb{R}^2, \|\cdot\|_1)$, the open ball
of radius r centered at $(x_0, y_0)$ is the open square with vertices $(x_0 \pm r, y_0)$ and
$(x_0, y_0 \pm r)$. ◆


Definition. Let u be a unit vector (i.e., a vector of length 1) in a normed linear
space X, and let $\xi$ be a fixed point in X. The line that contains $\xi$ and is parallel
to u is the set

$$
\{ \xi + tu : -\infty < t < \infty \}.
$$


Remark. The vector u in the above definition does not have to be a unit vector, but
when it is, $|t|$ is the exact distance between $\xi + tu$ and $\xi$. An important special case
is the equation of the line joining two points $\xi$ and $\eta$ in X. In this case, the line
is the set of all points x such that

$$
x = \xi + t(\eta - \xi) = (1 - t)\xi + t\eta, \quad -\infty < t < \infty.
$$

The set $\{(1 - t)\xi + t\eta : 0 \le t \le 1\}$ is called the line segment joining $\xi$ and $\eta$.

Example 10. In $\mathbb{R}^n$, the above definition reduces to the familiar definition
of a straight line, especially when $n = 2, 3$. Indeed, if $u = (u_1, \dots, u_n)$, and
$\xi = (\xi_1, \dots, \xi_n)$, then the parametric equation of the line containing $\xi$ and
parallel to u is

$x_1 = \xi_1 + t u_1,\ x_2 = \xi_2 + t u_2,\ \dots,\ x_n = \xi_n + t u_n, \quad -\infty < t < \infty$. ◆


Definition. Let E be a subset of a vector space V, and let $x \in V$. The set
$E + x = \{x + y : y \in E\}$ is known as the translation of E by x (or in the direction of x).


The set E + x can be visualized as rigidly moving E in the direction of the vector x.
The graph of the parabola $y = x^2 + 1$ is the translation of the graph of the parabola
$y = x^2$ by the vector (0,1). Figure 3.1 depicts the translation of the raindrop-like
set E by the vector x = (0,−1).

Translating a set preserves most of its characteristics. Convexity is a good 
example. 


Definition. A subset C of a vector space V is said to be convex if, for every $\xi, \eta \in C$,
and all $0 \le t \le 1$, $(1 - t)\xi + t\eta \in C$. Thus C is convex if whenever it contains $\xi$
and $\eta$, it contains the line segment joining $\xi$ and $\eta$.


Example 11.
(a) An open ball in $\mathbb{R}^n$ is a convex set.
(b) Let A be an $m \times n$ real matrix, and let $b \in \mathbb{R}^m$. The two sets
$\{x \in \mathbb{R}^n : Ax = b\}$ and $\{x \in \mathbb{R}^n : Ax \ge b\}$ are convex subsets of $\mathbb{R}^n$.³
(c) The union of the first and third quadrants in the plane is not convex.
(d) The raindrop region in figure 3.1 is not convex.
(e) The intersection of an arbitrary collection of convex sets is convex. ◆


Figure 3.1 The falling raindrop 


³ The notation $Ax \le b$ means that $(Ax)_i \le b_i$ for all $1 \le i \le m$; the notation $Ax \ge b$ is understood in the same way.


Excursion: Convex Hulls and Polytopes


We limit the discussion below to $\mathbb{R}^n$, although some of the statements are valid for
an arbitrary vector space.


Definition. A convex combination of a finite set $\{x_1, \dots, x_k\} \subseteq \mathbb{R}^n$ is a point of the
form $x = \sum_{i=1}^{k} \lambda_i x_i$ such that $0 \le \lambda_i \le 1$ and $\sum_{i=1}^{k} \lambda_i = 1$.


Theorem 3.6.5. A subset $C \subseteq \mathbb{R}^n$ is convex if and only if it contains all convex
combinations of points of C.


Proof. It is enough to show that if C is convex and $x = \sum_{i=1}^{k} \lambda_i x_i$ is a convex combination
of points $x_1, \dots, x_k \in C$, then $x \in C$. The converse is trivial. We use induction
on k. The statement is true for k = 2 by the very definition of convexity. Without
loss of generality, assume that $\lambda_1 < 1$, and write

$$
x = \lambda_1 x_1 + (1 - \lambda_1) \sum_{i=2}^{k} \frac{\lambda_i}{1 - \lambda_1} x_i.
$$

Now $\sum_{i=2}^{k} \frac{\lambda_i}{1 - \lambda_1} = 1$ and $y = \sum_{i=2}^{k} \frac{\lambda_i}{1 - \lambda_1} x_i \in C$ by the inductive hypothesis. By the
convexity of C, $x = \lambda_1 x_1 + (1 - \lambda_1) y \in C$. ∎


Definition. The convex hull of a nonempty set $A \subseteq \mathbb{R}^n$ is the smallest convex
subset of $\mathbb{R}^n$ that contains A. The notation conv(A) denotes the convex hull of A.


It is clear that conv(A) is the intersection of all the convex subsets of $\mathbb{R}^n$ that contain
A and that $\mathrm{conv}(A) \ne \emptyset$, since $A \subseteq \mathbb{R}^n$, and $\mathbb{R}^n$ is convex.


Theorem 3.6.6. For a nonempty set $A \subseteq \mathbb{R}^n$, conv(A) is the set of all convex
combinations of points of A.


Proof. By the previous theorem, it is enough to show that the set of all convex
combinations of points in A is a convex set. Suppose that $x = \sum_{i=1}^{k} \lambda_i x_i$ and
$y = \sum_{j=1}^{l} \mu_j y_j$ are, respectively, convex combinations of points $x_1, \dots, x_k$ and
$y_1, \dots, y_l$ in A. If $\alpha \in [0, 1]$, then

$$
(1 - \alpha)x + \alpha y = \sum_{i=1}^{k} (1 - \alpha)\lambda_i x_i + \sum_{j=1}^{l} \alpha \mu_j y_j
$$

is a convex combination of $x_1, \dots, x_k, y_1, \dots, y_l$ because

$$
\sum_{i=1}^{k} (1 - \alpha)\lambda_i + \sum_{j=1}^{l} \alpha \mu_j = 1.
$$
∎

A natural question now is whether there is an upper bound on the length of the
convex combinations of vectors in A needed to generate all of conv(A). The next
theorem provides the answer.


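Theorem 3.6.7 (Carathéodory's theorem). Let $A \subseteq \mathbb{R}^n$, and let C = conv(A).
Then every point in C is the convex combination of, at most, n + 1 points of A.


Proof. Let $x \in C$. Then $x = \sum_{i=1}^{k} \lambda_i x_i$ for some $x_1, \dots, x_k \in A$ and some $\lambda_1, \dots, \lambda_k \in (0, 1]$
with $\sum_{i=1}^{k} \lambda_i = 1$. If $k > n + 1$, then the vectors $x_2 - x_1, \dots, x_k - x_1$ are
linearly dependent, so there are constants $\mu_2, \dots, \mu_k$, not all zero, such that
$\sum_{j=2}^{k} \mu_j (x_j - x_1) = 0$. If we set $\mu_1 = -\sum_{j=2}^{k} \mu_j$, then $\sum_{j=1}^{k} \mu_j x_j = 0$, and $\sum_{j=1}^{k} \mu_j = 0$.
Observe that at least one of the numbers $\mu_1, \dots, \mu_k$ is positive. Now, for all
$\alpha \in \mathbb{R}$, $x = \sum_{j=1}^{k} (\lambda_j - \alpha \mu_j) x_j$. Let i be such that $\frac{\lambda_i}{\mu_i} = \min_{1 \le j \le k} \{ \frac{\lambda_j}{\mu_j} : \mu_j > 0 \}$ and
choose $\alpha = \frac{\lambda_i}{\mu_i}$. Observe that $\alpha > 0$ and, for $1 \le j \le k$, $\lambda_j - \alpha \mu_j \ge 0$. Now
$x = \sum_{j=1}^{k} (\lambda_j - \alpha \mu_j) x_j$, $\lambda_j - \alpha \mu_j \ge 0$, and $\sum_{j=1}^{k} (\lambda_j - \alpha \mu_j) = 1$. Since $\lambda_i - \alpha \mu_i = 0$, x
is a convex combination of, at most, k − 1 points of A. We continue this process
until x is a convex combination of, at most, n + 1 vectors in A.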
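The proof of Carathéodory's theorem is constructive, and the reduction step can be carried out mechanically. The sketch below is one possible rendering under stated assumptions: it works in exact rational arithmetic, finds the dependency $\mu_2, \dots, \mu_k$ by a small Gauss-Jordan elimination, and then applies the choice $\alpha = \min\{\lambda_j / \mu_j : \mu_j > 0\}$ from the proof. The names `null_vector` and `caratheodory` and the sample points are hypothetical.

```python
from fractions import Fraction

def null_vector(vectors):
    """Return coefficients m, not all zero, with sum_j m[j] * vectors[j] == 0,
    or None if the vectors are linearly independent (exact arithmetic)."""
    k, n = len(vectors), len(vectors[0])
    # Row-reduce the n x k matrix whose j-th column is vectors[j].
    M = [[Fraction(vectors[j][i]) for j in range(k)] for i in range(n)]
    pivots, r = [], 0
    for c in range(k):
        if r == n:
            break
        pr = next((i for i in range(r, n) if M[i][c] != 0), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        pivot = M[r][c]
        M[r] = [v / pivot for v in M[r]]
        for i in range(n):
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(k) if c not in pivots]
    if not free:
        return None
    m = [Fraction(0)] * k
    m[free[0]] = Fraction(1)                 # set one free variable to 1
    for row, c in zip(M, pivots):
        m[c] = -row[free[0]]                 # solve for the pivot variables
    return m

def caratheodory(points, weights):
    """Rewrite the convex combination sum_i weights[i] * points[i] in R^n so
    that it uses at most n + 1 of the points, following the proof's reduction."""
    pairs = [(tuple(Fraction(c) for c in p), Fraction(w))
             for p, w in zip(points, weights) if w > 0]
    pts = [p for p, _ in pairs]
    lam = [w for _, w in pairs]
    n = len(pts[0])
    while len(pts) > n + 1:
        diffs = [[a - b for a, b in zip(p, pts[0])] for p in pts[1:]]
        tail = null_vector(diffs)            # mu_2, ..., mu_k
        mu = [-sum(tail)] + tail             # mu_1 chosen so that sum(mu) == 0
        alpha = min(l / m for l, m in zip(lam, mu) if m > 0)
        lam = [l - alpha * m for l, m in zip(lam, mu)]
        keep = [i for i, l in enumerate(lam) if l > 0]
        pts = [pts[i] for i in keep]
        lam = [lam[i] for i in keep]
    return pts, lam

# Illustrative data: five points in the plane with equal weights.
points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
weights = [Fraction(1, 5)] * 5
reduced_points, reduced_weights = caratheodory(points, weights)
print(reduced_points, reduced_weights)
```

Exact fractions matter here: the argument relies on at least one coefficient becoming exactly zero at each step, which floating-point arithmetic would not guarantee.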


Example 12. It is possible that k < n + 1. The closed unit disk D in $\mathbb{R}^2$ is the convex
hull of the unit circle $S^1$, and every interior point in D is a convex combination
of two vectors in $S^1$. However, k = n + 1 is the best possible bound. For example,
if $x_1$, $x_2$, and $x_3$ are three noncollinear points in the plane, then an interior point
in the triangle defined by the three points is not a convex combination of any
two of the three points. ◆


Definition. A polytope in $\mathbb{R}^n$ is the convex hull of a finite subset of $\mathbb{R}^n$.


Definition. A point x in a convex subset $C \subseteq \mathbb{R}^n$ is said to be an extreme point
of C if whenever $y, z \in C$, and $y \ne z$, then, for any $\lambda \in (0, 1)$, $x \ne \lambda y + (1 - \lambda) z$.
The extreme points of a polytope are more specifically called its vertices.


A convex set may not have any extreme points. A simple example is the set
$\{(x, y) \in \mathbb{R}^2 : 0 \le x \le 1, -\infty < y < \infty\}$.


Example 13. If $x_1$ and $x_2$ are distinct points in $\mathbb{R}^n$, then the polytope
$Q = \mathrm{conv}(x_1, x_2)$ is the line segment $\{(1 - t)x_1 + t x_2 : 0 \le t \le 1\}$. It is easy to
verify that $x_1$ and $x_2$ are the vertices of Q. ◆


The number of vertices of a polytope in $\mathbb{R}^n$ is not related to the dimension n of
the space. For example, a regular polygon in $\mathbb{R}^2$ can have any number of vertices.

While it is intuitively obvious that a polytope is the convex hull of its vertices,
this is not an entirely trivial fact. We prove this fact, together with the even more
fundamental fact that polytopes do have vertices.


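Lemma 3.6.8. The vertices of a polytope $Q = \mathrm{conv}(x_1, \dots, x_k)$ in $\mathbb{R}^n$ are contained
in $\{x_1, \dots, x_k\}$.


Proof. We show that if $x \in Q$ and $x \notin \{x_1, \dots, x_k\}$, then x is not an extreme point of Q.
By assumption, x is a convex combination of $x_1, \dots, x_k$, say, $x = \sum_{i=1}^{k} \lambda_i x_i$,
and assume, without loss of generality, that $\lambda_1 > 0$. Since $x \ne x_1$, $\lambda_1 \ne 1$. Let
$y = \sum_{i=2}^{k} \frac{\lambda_i}{1 - \lambda_1} x_i$. Since y is a convex combination of $x_2, \dots, x_k$, $y \in Q$ and
$x = \lambda_1 x_1 + (1 - \lambda_1) y$. Moreover, $y \ne x_1$, since otherwise $x = x_1$. Since $0 < \lambda_1 < 1$, x is not
an extreme point of Q. ∎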


Lemma 3.6.9. Consider the polytope $Q = \mathrm{conv}(x_1, \dots, x_k)$. If $x_k$ is not a vertex
of Q, then $x_k$ is a convex combination of $x_1, \dots, x_{k-1}$. Consequently,
$Q = \mathrm{conv}(x_1, \dots, x_{k-1})$.


Proof. Since $x_k$ is not a vertex of Q, there exist convex combinations $y = \sum_{i=1}^{k} \beta_i x_i$
and $z = \sum_{i=1}^{k} \gamma_i x_i$ ($y \ne z$), and a number $\lambda \in (0, 1)$ such that $x_k = \lambda y + (1 - \lambda) z$.
Now $x_k = \sum_{i=1}^{k} \alpha_i x_i$, where $\alpha_i = \lambda \beta_i + (1 - \lambda) \gamma_i$. If $\alpha_k = 0$, the proof is complete.
It is easy to check that $\alpha_k = 1$ is possible only if $\beta_k = \gamma_k = 1$. But this would force
$y = z = x_k$, which is a contradiction. Thus $0 < \alpha_k < 1$ and $x_k = \sum_{i=1}^{k-1} \frac{\alpha_i}{1 - \alpha_k} x_i$, as
desired. We leave it to the reader to check that $Q = \mathrm{conv}(x_1, \dots, x_{k-1})$.

Theorem 3.6.10. The set of vertices of a polytope $Q = \mathrm{conv}(x_1, \dots, x_k)$ is not empty,
and Q is the convex hull of its vertices.


Proof. We prove the result by induction on k. The result is true for k = 2 by example
13. Now consider the polytope $Q = \mathrm{conv}(x_1, \dots, x_k)$. If all the points $x_1, \dots, x_k$ are
vertices of Q, there is nothing to prove. Otherwise, one point, say, $x_k$, is not a vertex
of Q. By the previous lemma, $Q = \mathrm{conv}(x_1, \dots, x_{k-1})$. By the inductive hypothesis,
Q is the convex hull of its vertices. ∎


The fact that a polytope is the convex hull of its extreme points is the weakest
version of the well-known Krein-Milman theorem.


An important special type of polytope is the simplex.

Definition. The standard n-simplex, $T_n$, in $\mathbb{R}^n$ is the convex hull of the n + 1
vectors $0, e_1, \dots, e_n$. In general, if $\{x_0, \dots, x_n\} \subseteq \mathbb{R}^n$ is such that $x_1 - x_0, \dots, x_n - x_0$
are independent, the set $\mathrm{conv}\{x_0, \dots, x_n\}$ is called an n-simplex in $\mathbb{R}^n$.

The standard 2-simplex is a triangle with vertices (0,0), (1,0), and (0,1). The
standard 3-simplex is a pyramid with vertices (0,0,0), (1,0,0), (0,1,0), and
(0,0,1).

Every point x in the standard n-simplex can be written uniquely as $x = \sum_{i=1}^{n} \lambda_i e_i$,
where $\lambda_i \in [0, 1]$, and $\sum_{i=1}^{n} \lambda_i \le 1$. Set $\lambda_0 = 1 - \sum_{i=1}^{n} \lambda_i$. The numbers $\lambda_0, \dots, \lambda_n$
are called the barycentric coordinates of x.
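For the standard simplex the barycentric coordinates can be read off directly from the Cartesian coordinates. The following sketch (the function name is hypothetical) returns $(\lambda_0, \lambda_1, \dots, \lambda_n)$ for a point given by its coordinates and rejects points that lie outside the simplex.

```python
from fractions import Fraction

def barycentric(x):
    """Barycentric coordinates (lambda_0, ..., lambda_n) of a point x of the
    standard n-simplex; raises ValueError if x lies outside the simplex."""
    lam = [Fraction(t) for t in x]
    lam0 = 1 - sum(lam)                      # weight attached to the vertex 0
    if lam0 < 0 or any(t < 0 for t in lam):
        raise ValueError("point is not in the standard simplex")
    return [lam0] + lam

coords = barycentric([Fraction(1, 4), Fraction(1, 2)])
print(coords)
```

The coordinates are nonnegative and sum to 1, so x is the convex combination of the vertices $0, e_1, \dots, e_n$ with these weights.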
Exercises 
1. Show that, for elements x and y of a normed linear space,
$\big|\,\|x\| - \|y\|\,\big| \le \|x + y\|$.

2. Let $a_1, \dots, a_n$ be the columns of a real $n \times n$ matrix A. Prove that the function

$$
\|A\|_1 = \max_{1 \le i \le n} \|a_i\|_1
$$

defines a norm on $\mathbb{R}_{n \times n}$ and that, for $x \in \mathbb{R}^n$,

$$
\|Ax\|_1 \le \|A\|_1 \|x\|_1.
$$

3. Show that if $1 \le r < s \le \infty$, then $\ell^r \subseteq \ell^s$.

4. Show that if $x \in \ell^1$, then $\lim_{p \to \infty} \|x\|_p = \|x\|_\infty$.

5. Show that in a normed linear space, $B(x, r) + B(y, s) = B(x + y, r + s)$.

6. Prove that the translation of a convex subset of $\mathbb{R}^n$ is convex.

7. Prove that x is an extreme point of a convex set C if and only if $C - \{x\}$ is
convex.

8. Prove that $0, e_1, \dots, e_n$ are the vertices of the standard n-simplex.


3.7 Inner Product Spaces 


The concept of an inner product stems from the need to have an instrument that
determines the orthogonality of vectors in a normed linear space. Let us consider
the Euclidean norm on $\mathbb{R}^n$. The orthogonality of two vectors $x = (x_1, \dots, x_n)$ and
$y = (y_1, \dots, y_n)$ in $\mathbb{R}^n$ is equivalent to the condition that $\|x + y\|^2 = \|x\|^2 + \|y\|^2$
(the Pythagorean theorem). Now

$$
\|x + y\|^2 = \sum_{i=1}^{n} (x_i + y_i)^2 = \sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} y_i^2 + 2 \sum_{i=1}^{n} x_i y_i = \|x\|^2 + \|y\|^2 + 2 \sum_{i=1}^{n} x_i y_i.
$$

Thus the orthogonality of x and y is equivalent to the condition $\sum_{i=1}^{n} x_i y_i = 0$.
This suggests that we examine the function $B : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ defined by
$B(x, y) = \sum_{i=1}^{n} x_i y_i$. A little reflection reveals that B is linear in each of its
arguments, that $B(x, y) = 0$ if and only if x and y are orthogonal, and that B
defines the norm in the sense that $B(x, x) = \|x\|^2$. Therefore a function B with the
above properties may be a useful specialization of norms in that it provides a tool
for defining orthogonality. Abstracting the above discussion leads directly to the
definition of an inner product.


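Definition. An inner product on a vector space H is a function
$\langle \cdot, \cdot \rangle : H \times H \to \mathbb{K}$ such that, for all $x, y, z \in H$ and all scalars $a \in \mathbb{K}$,


(a) $\langle x, y \rangle = \overline{\langle y, x \rangle}$,

(b) $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$,

(c) $\langle ax, y \rangle = a \langle x, y \rangle$, and

(d) $\langle x, x \rangle \ge 0$, and $\langle x, x \rangle = 0$ if and only if $x = 0$.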


A vector space H with an inner product is called an inner product space. 


Example 1. The standard inner product on $\mathbb{C}^n$ is defined by

$$
\langle x, y \rangle = \sum_{i=1}^{n} x_i \overline{y_i} = y^* x.
$$

Consistent with matrix notation, we write vectors as columns and $y^*$ as the
conjugate transpose of the vector y, while $y^* x$ is conveniently thought of as a
matrix product. The standard inner product on $\mathbb{R}^n$ is defined by

$$
\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i = y^T x = x^T y.
$$


Example 2. The space $\ell^2$ is an inner product space with the inner product
$\langle x, y \rangle = \sum_{n=1}^{\infty} x_n \overline{y_n}$.


Example 3. The space C[a, b] is an inner product space with the inner product

$\langle f, g \rangle = \int_a^b f(x) \overline{g(x)}\,dx$. ◆

Of particular interest to us is the special case when $a = -\pi$, $b = \pi$. In this case, we
define

$$
\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) \overline{g(x)}\,dx.
$$

The normalization constant $1/2\pi$ is included for convenience, as will become clear
later in this section.


For an element x in an inner product space H, we write $\|x\| = \sqrt{\langle x, x \rangle}$. We will see
shortly that $\|\cdot\|$ is indeed a norm on H.

Theorem 3.7.1 (the Cauchy-Schwarz inequality). If H is an inner
product space, then, for all $x, y \in H$,

$$
|\langle x, y \rangle| \le \|x\| \|y\|.
$$

Equality holds if and only if x and y are linearly dependent.


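Proof. Without loss of generality, assume that $y \ne 0$. For $\alpha \in \mathbb{K}$,

$$
0 \le \|x + \alpha y\|^2 = \langle x + \alpha y, x + \alpha y \rangle
= \langle x, x \rangle + \alpha \langle y, x \rangle + \overline{\alpha} \langle x, y \rangle + |\alpha|^2 \langle y, y \rangle.
$$

Substituting $\alpha = -\langle x, y \rangle / \|y\|^2$, we obtain $0 \le \|x\|^2 - \frac{|\langle x, y \rangle|^2}{\|y\|^2}$, from which the
Cauchy-Schwarz inequality follows.

It is easy to verify that if $y = \alpha x$, then $|\langle x, y \rangle| = \|x\| \|y\|$. Conversely, suppose
that $|\langle x, y \rangle| = \|x\| \|y\|$. Now

$$
\big\| \|y\|^2 x - \langle x, y \rangle y \big\|^2 = \big\langle \|y\|^2 x - \langle x, y \rangle y,\ \|y\|^2 x - \langle x, y \rangle y \big\rangle
$$
$$
= \|y\|^4 \|x\|^2 - \|y\|^2 \overline{\langle x, y \rangle} \langle x, y \rangle - \|y\|^2 \langle x, y \rangle \langle y, x \rangle + |\langle x, y \rangle|^2 \|y\|^2
= \|y\|^2 \{ \|x\|^2 \|y\|^2 - |\langle x, y \rangle|^2 \} = 0.
$$

Thus $\|y\|^2 x - \langle x, y \rangle y = 0$, and x and y are dependent.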
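Both parts of the Cauchy-Schwarz inequality can be observed numerically on $\mathbb{C}^n$ with the standard inner product. The vectors below are arbitrary choices made for this illustration and the helper names are hypothetical; the second computation takes one vector to be a scalar multiple of the other, in which case the two sides agree up to rounding.

```python
import math

def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x).real)

# Illustrative data (an assumption of this example).
x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 1 + 1j, -2j]

# Nonnegative by the Cauchy-Schwarz inequality.
gap = norm(x) * norm(y) - abs(inner(x, y))

# Equality case: z is a scalar multiple of x, so x and z are dependent.
z = [(2 - 3j) * a for a in x]
equality_gap = norm(x) * norm(z) - abs(inner(x, z))

print(gap, equality_gap)
```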

Corollary 3.7.2 (the triangle inequality). In an inner product space H,
$\|x + y\| \le \|x\| + \|y\|$.

Proof. Using the Cauchy-Schwarz inequality,

$$
\|x + y\|^2 = \langle x + y, x + y \rangle = \|x\|^2 + \langle x, y \rangle + \langle y, x \rangle + \|y\|^2
= \|x\|^2 + 2\,\mathrm{Re}\,\langle x, y \rangle + \|y\|^2 \le \|x\|^2 + 2 \|x\| \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2.
$$

Taking the square roots of the extreme sides of the above string yields the triangle
inequality.


It follows from the above corollary that the function $\|x\| = \langle x, x \rangle^{1/2}$ defines a norm
on H. Therefore every inner product space is a normed linear space.


Definition. Two vectors x and y in an inner product space are said to be orthogonal
if $\langle x, y \rangle = 0$. Symbolically, we write $x \perp y$ to indicate the orthogonality of
x and y.

Theorem 3.7.3 (the Pythagorean theorem). If x and y are orthogonal vectors in
an inner product space, then

$$
\|x + y\|^2 = \|x\|^2 + \|y\|^2.
$$


Proof.
$\|x + y\|^2 = \langle x + y, x + y \rangle = \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle
= \langle x, x \rangle + \langle y, y \rangle = \|x\|^2 + \|y\|^2$.

The Pythagorean theorem can be easily generalized as follows: if $x_1, \dots, x_n$ are
mutually orthogonal elements of an inner product space, then

$$
\|x_1 + \dots + x_n\|^2 = \|x_1\|^2 + \dots + \|x_n\|^2.
$$

Definition. A subset S of an inner product space H is said to be orthogonal if the 
vectors in S are pairwise orthogonal. If, in addition, each vector in S is a unit 
vector, then S is called an orthonormal subset of H. We always assume that an 
orthogonal subset excludes the zero vector. 


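Example 4. The canonical vectors in $\mathbb{K}^n$ form an orthonormal set. ◆


Example 5. The set of functions $\{u_n(t) = e^{int} : n \in \mathbb{Z}\}$ is orthogonal in $C[-\pi, \pi]$
with the inner product $\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) \overline{g(x)}\,dx$. (Here $e^{i\theta} = \cos\theta + i \sin\theta$.)
Indeed, if m and n are distinct integers, then

$$
\langle u_n, u_m \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{int} e^{-imt}\,dt = \frac{1}{2\pi i (n - m)} e^{i(n - m)t} \Big|_{-\pi}^{\pi} = 0,
$$

while

$$
\langle u_n, u_n \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{int} e^{-int}\,dt = 1.
$$

Observe the convenience of including the factor $1/2\pi$ in the definition of the
inner product. ◆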
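The orthogonality relations for the exponentials can also be seen numerically. The sketch below approximates the normalized integral $\frac{1}{2\pi}\int_{-\pi}^{\pi} f\,\overline{g}$ with the midpoint rule; the number of sample points and the function names are choices made for this illustration, not part of the text.

```python
import cmath
import math

def inner(f, g, samples=256):
    """Approximate (1/2pi) * integral over [-pi, pi] of f(t) * conj(g(t)) dt
    with the midpoint rule on `samples` equal subintervals."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        t = -math.pi + (k + 0.5) * h         # midpoint of the k-th subinterval
        total += f(t) * g(t).conjugate()
    return total * h / (2 * math.pi)

def u(n):
    """The function t -> e^{int}."""
    return lambda t: cmath.exp(1j * n * t)

print(abs(inner(u(3), u(3))), abs(inner(u(3), u(-2))))
```

For these functions the midpoint sums are geometric sums, so the approximation is accurate up to rounding as long as the difference of the two indices is smaller than the number of sample points.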


Theorem 3.7.4. An orthogonal subset S of an inner product space H is independent.


Proof. Let $\{u_1, \dots, u_n\}$ be a finite subset of S, and suppose that, for scalars $a_1, \dots, a_n$,
$\sum_{i=1}^{n} a_i u_i = 0$. For a fixed $1 \le j \le n$,

$$
\Big\langle \sum_{i=1}^{n} a_i u_i, u_j \Big\rangle = \sum_{i=1}^{n} a_i \langle u_i, u_j \rangle = a_j \|u_j\|^2.
$$

But

$$
\Big\langle \sum_{i=1}^{n} a_i u_i, u_j \Big\rangle = \langle 0, u_j \rangle = 0.
$$

Since $u_j \ne 0$, it follows that $a_j = 0$, and $\{u_1, \dots, u_n\}$ is independent.


We now pause briefly in order to get confirmation that the concepts we
have developed so far are consistent with the geometry of $\mathbb{R}^n$. The Cauchy-Schwarz
inequality, which we write as $\frac{|\langle x, y \rangle|}{\|x\| \|y\|} \le 1$, implies that there exists a
unique number $\theta \in [0, \pi]$ such that $\cos\theta = \frac{\langle x, y \rangle}{\|x\| \|y\|}$, or $\langle x, y \rangle = \|x\| \|y\| \cos\theta$. Now
$\|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2 \langle x, y \rangle = \|x\|^2 + \|y\|^2 - 2 \|x\| \|y\| \cos\theta$. When n = 2, the
last identity is the well-known law of cosines in elementary trigonometry. The
number $\theta$, of course, is the angle between x and y.


We continue to exploit the geometry of vectors in $\mathbb{R}^2$ to get direction for the next
step. An important concept in geometry (and in Hilbert space theory) is that of
projecting a vector onto another. Let $x \in \mathbb{R}^2$ and let u be a unit vector in the plane.
The length of the projection of x onto u is given by $\|x\| \cos\theta = \langle x, u \rangle$; hence the
vector projection of x onto the line containing u is the vector $y = \langle x, u \rangle u$. This is
the closest vector in the line containing u to the vector x. Since the projection
of a vector $x \in \mathbb{R}^3$ onto the span M of two orthonormal vectors $u_1$ and $u_2$ is the
sum of the individual projections of x onto $u_1$ and $u_2$, the projection of x on M is
$\langle x, u_1 \rangle u_1 + \langle x, u_2 \rangle u_2$. The constructions involved in the next two theorems are now
well motivated.


Theorem 3.7.5. Let $S = \{u_1, \dots, u_n\}$ be a finite orthonormal subset of an inner
product space H, let $x \in \mathrm{Span}(S)$, and write $\hat{x}_i = \langle x, u_i \rangle$. Then

$$
x = \sum_{i=1}^{n} \hat{x}_i u_i \quad \text{and} \quad \|x\|^2 = \sum_{i=1}^{n} |\hat{x}_i|^2.
$$

In particular, if $\langle x, u_i \rangle = 0$ for all $1 \le i \le n$, then x = 0.


Proof. Since $x \in \mathrm{Span}(S)$, there are scalars $a_1, \dots, a_n$ such that $x = \sum_{i=1}^{n} a_i u_i$. For a
fixed $1 \le j \le n$,

$$
\hat{x}_j = \langle x, u_j \rangle = \sum_{i=1}^{n} a_i \langle u_i, u_j \rangle = a_j.
$$

By the Pythagorean theorem,

$$
\|x\|^2 = \sum_{i=1}^{n} \|\hat{x}_i u_i\|^2 = \sum_{i=1}^{n} |\hat{x}_i|^2.
$$


Definition. Let M be a subspace of an inner product space H. The orthogonal
complement of M is the set

$$
M^\perp = \{ z \in H : z \perp x \ \ \forall x \in M \}.
$$

It is clear that $M^\perp$ is a subspace of H and that $M \cap M^\perp = \{0\}$.


Example 6. Let $a = (a_1, \dots, a_n)^T$ be a nonzero vector in $\mathbb{R}^n$. The orthogonal
complement of M = Span({a}) is the set of all vectors x such that $a^T x = 0$.
Thus $M^\perp$ can be viewed as the kernel of a $1 \times n$ matrix and is therefore a
maximal subspace of $\mathbb{R}^n$. A translation of a maximal subspace of $\mathbb{R}^n$ is called
a hyperplane. Thus the equation of a hyperplane is of the form $\sum_{i=1}^{n} a_i x_i = b$,
$b \in \mathbb{R}$. Observe that the hyperplane $a^T x = b$ partitions $\mathbb{R}^n$ into the three sets
$\{x \in \mathbb{R}^n : a^T x = b\}$, $\{x \in \mathbb{R}^n : a^T x < b\}$, and $\{x \in \mathbb{R}^n : a^T x > b\}$. The latter two
sets are called the open half-spaces determined by the hyperplane $a^T x = b$. ◆


Theorem 3.7.6. Let $S = \{u_1, \dots, u_n\}$ be a finite orthonormal subset of H, and let
M = Span(S). Then every vector $x \in H$ can be written uniquely as

$$
x = y + z, \quad \text{where } y \in M \text{ and } z \in M^\perp.
$$

Additionally, y is the closest vector in M to x in the sense that if $y' \in M$ and $y' \ne y$,
then $\|x - y\| < \|x - y'\|$.


Proof. Define $y = \sum_{i=1}^{n} \langle x, u_i \rangle u_i$, and let $z = x - y$.
Clearly, $y \in M$. We show that $z \in M^\perp$. For $1 \le j \le n$,

$$
\langle z, u_j \rangle = \Big\langle x - \sum_{i=1}^{n} \langle x, u_i \rangle u_i, u_j \Big\rangle
= \langle x, u_j \rangle - \sum_{i=1}^{n} \langle x, u_i \rangle \langle u_i, u_j \rangle = \langle x, u_j \rangle - \langle x, u_j \rangle = 0.
$$

This shows that $H = M + M^\perp$. To show the uniqueness part, suppose that
$x = y + z = y' + z'$, where $y, y' \in M$ and $z, z' \in M^\perp$. Thus $y - y' = z' - z$. Since $y - y' \in M$
and $z' - z \in M^\perp$, $y - y' \in M \cap M^\perp = \{0\}$. Hence $y = y'$ and $z = z'$.

To prove the last assertion, suppose $y' \in M$, $y' \ne y$, and write $x - y' = (x - y) + (y - y')$.
Now $x - y = z \in M^\perp$ and $y - y' \in M$. Using the Pythagorean theorem,
we have $\|x - y'\|^2 = \|x - y\|^2 + \|y - y'\|^2 > \|x - y\|^2$.


The vector $y = \sum_{i=1}^{n} \langle x, u_i \rangle u_i$ is called the orthogonal projection of x on M. We
reiterate that y is the closest vector of M to x. We also say that y is the best
approximation of x in M.
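The projection formula $y = \sum_i \langle x, u_i \rangle u_i$ translates directly into code. The sketch below works in $\mathbb{R}^3$ with the standard (real) inner product; the orthonormal pair $u_1, u_2$, the vector x, and the function names are chosen for this illustration. It computes y, forms z = x − y, and lets one check that z is orthogonal to both basis vectors.

```python
import math

def dot(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def project(x, orthonormal_basis):
    """Orthogonal projection y = sum_i <x, u_i> u_i of x onto Span(u_1, ..., u_n).
    The basis vectors are assumed to be orthonormal."""
    y = [0.0] * len(x)
    for u in orthonormal_basis:
        c = dot(x, u)
        y = [yi + c * ui for yi, ui in zip(y, u)]
    return y

# Illustrative data (an assumption of this example).
s = 1 / math.sqrt(2)
u1 = [s, s, 0.0]
u2 = [0.0, 0.0, 1.0]
x = [3.0, 1.0, 2.0]

y = project(x, [u1, u2])                     # component of x in M
z = [a - b for a, b in zip(x, y)]            # component of x in the orthogonal complement
print(y, z, dot(z, u1), dot(z, u2))
```

Comparing the distance from x to y with the distance from x to any other point of the span gives a quick numerical illustration of the best-approximation property.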


Sometimes the basis vectors $u_1, \dots, u_n$ in the above theorem are merely orthogonal
and not orthonormal. In this case, we find the orthogonal projection of x on M by
using the formula

$$
y = \sum_{i=1}^{n} \langle x, u_i \rangle \frac{u_i}{\|u_i\|^2},
$$

which is the previously stated formula for y when each $u_i$ is replaced with the
normalized vector $\frac{u_i}{\|u_i\|}$.

The following question arises naturally: does every finite-dimensional
inner product space have an orthonormal basis? The following theorem delivers
the answer.


Theorem 3.7.7. Every finite-dimensional inner product space contains an orthonor- 
mal basis. 


Proof. We use induction on dim(H). Let $\{v_1, \dots, v_n\}$ be a basis for H. Use the
inductive hypothesis to find an orthogonal basis $\{u_1, \dots, u_{n-1}\}$ for the inner
product space $\mathrm{Span}(\{v_1, \dots, v_{n-1}\})$, and define

$$
u_n = v_n - \sum_{j=1}^{n-1} \langle v_n, u_j \rangle \frac{u_j}{\|u_j\|^2}.
$$

Clearly, $u_n \ne 0$ because otherwise $v_n \in \mathrm{Span}(\{u_1, \dots, u_{n-1}\}) = \mathrm{Span}(\{v_1, \dots, v_{n-1}\})$.
Observe that $u_n$ is nothing but the difference between $v_n$ and its orthogonal
projection on $\mathrm{Span}(\{u_1, \dots, u_{n-1}\})$ (the vector z in the notation of theorem 3.7.6),
and therefore $u_n$ is orthogonal to each of the vectors $u_1, \dots, u_{n-1}$. By theorem
3.7.4, the orthogonal set $\{u_1, u_2, \dots, u_n\}$ is independent and hence is a basis for
H. To obtain the desired orthonormal basis, we simply normalize each of the
vectors $u_i$.


The above theorem and its proof deliver more than the mere existence of an 
orthonormal basis for an arbitrary finite-dimensional inner product space. 
The proof is inductive and constructive; hence it can be applied to an infinite 
independent sequence of vectors, recursively, as follows. 
The Gram-Schmidt Process

Given an infinite sequence $v_1, v_2, \dots$ of independent vectors in an inner product
space, the sequence defined below is orthogonal:
$$u_1 = v_1, \quad \text{and, for } n \ge 2, \quad u_n = v_n - \sum_{j=1}^{n-1} \langle v_n, u_j \rangle \frac{u_j}{\|u_j\|^2}.$$

Additionally, for each $n \in \mathbb{N}$,
$$V_n = \mathrm{Span}(\{v_1, \dots, v_n\}) = \mathrm{Span}(\{u_1, \dots, u_n\}) = U_n.$$
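
The recursion can be tried out on a small example. The following is a minimal sketch, assuming the standard inner product on $\mathbb{R}^3$; the function names `inner` and `gram_schmidt` and the sample vectors are illustrative choices, not notation from the text.

```python
def inner(x, y):
    """Standard real inner product on R^n (an assumption of this sketch)."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(vectors):
    """Return the orthogonal sequence u_1, ..., u_n built from v_1, ..., v_n.

    Each u_n is v_n minus its orthogonal projection on Span{u_1, ..., u_{n-1}}.
    The input vectors are assumed to be independent.
    """
    us = []
    for v in vectors:
        u = list(v)
        for w in us:
            coeff = inner(v, w) / inner(w, w)  # <v_n, u_j> / ||u_j||^2
            u = [ui - coeff * wi for ui, wi in zip(u, w)]
        us.append(u)
    return us


# Three independent vectors in R^3 (sample data chosen for illustration).
vs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
us = gram_schmidt(vs)
for u in us:
    print(u)
```

The output vectors are orthogonal but not normalized; dividing each by its norm gives the orthonormal basis of theorem 3.7.7.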


Example 7. Consider the space $C[-1,1]$ with the inner product $\langle f, g \rangle = \int_{-1}^{1} f(x) g(x)\,dx$.
Applying the Gram-Schmidt process to the infinite independent
sequence of monomials $1, x, x^2, \dots$, we obtain a sequence of orthogonal
polynomials $P_0, P_1, \dots$ that spans the space of polynomials such that
$$\mathrm{Span}(\{1, x, \dots, x^n\}) = \mathrm{Span}(\{P_0, P_1, \dots, P_n\}) \quad \text{for all } n \ge 0.$$

The polynomials $P_n$ are known as the Legendre polynomials. We will study
some of the properties of these polynomials in section 4.10. ◆


The following observation is sometimes crucial for avoiding the often cumbersome
calculations needed to compute the orthogonal sequence $u_1, u_2, \dots$

If $w_1, w_2, \dots$ is an orthogonal sequence and, for each $n \in \mathbb{N}$, $\mathrm{Span}(\{v_1, \dots, v_n\}) =
\mathrm{Span}(\{w_1, \dots, w_n\})$, then each $w_n$ is a multiple of the corresponding $u_n$. This is
because the orthogonal complement $U_{n-1}^{\perp}$ of $U_{n-1}$ in $V_n$ is one-dimensional, hence
any nonzero vector in $U_{n-1}^{\perp}$ is a multiple of any other nonzero vector in $U_{n-1}^{\perp}$. The
following example exploits this idea to generate the Legendre polynomials.


Example 8. Consider the following set of polynomials:
$$Q_n(x) = D^n[(x^2 - 1)^n],$$
where $D^n f$ denotes the $n$th derivative of $f$. Each $Q_n$ is a polynomial of exact degree
$n$; thus $\mathrm{Span}(\{P_0, \dots, P_n\}) = \mathrm{Span}(\{Q_0, \dots, Q_n\}) = \mathrm{Span}(\{1, x, \dots, x^n\})$. If we
show that $Q_n$ is orthogonal to each of the monomials $x^j$ for $0 \le j \le n-1$, then
the polynomials $\{Q_n : n \in \mathbb{N}\}$ are orthogonal and, by the above observation,
$P_n = c_n Q_n$.

Integration by parts yields
$$\int_{-1}^{1} x^j D^n (x^2 - 1)^n\,dx = -j \int_{-1}^{1} x^{j-1} D^{n-1} (x^2 - 1)^n\,dx + \Big[ x^j D^{n-1} (x^2 - 1)^n \Big]_{-1}^{1}.$$

The second term is zero because if $k < n$, then $x^2 - 1$ is a factor of $D^k (x^2 - 1)^n$.
The same reason coupled with integration by parts $j - 1$ times proves the desired
result. ◆
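
The relation $P_n = c_n Q_n$ can be checked in exact rational arithmetic for small $n$. The sketch below is an illustration under stated assumptions: polynomials are stored as coefficient lists (index equals degree), the $P_n$ are the unnormalized (monic) Gram-Schmidt output for the monomials, and all function names are hypothetical helpers rather than notation from the text.

```python
from fractions import Fraction
from math import comb


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def inner(p, q):
    """<p, q> = integral of p(x) q(x) over [-1, 1], computed exactly."""
    # The integral of x^k over [-1, 1] is 2/(k+1) for even k and 0 for odd k.
    prod = poly_mul(p, q)
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0)


def derivative(p):
    """Differentiate a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def rodrigues(n):
    """Q_n = D^n[(x^2 - 1)^n], expanded with the binomial theorem."""
    p = [Fraction(0)] * (2 * n + 1)
    for k in range(n + 1):
        p[2 * k] = Fraction(comb(n, k) * (-1) ** (n - k))
    for _ in range(n):
        p = derivative(p)
    return p


def gram_schmidt_monomials(count):
    """Orthogonalize 1, x, ..., x^(count-1) under the inner product above."""
    us = []
    for n in range(count):
        v = [Fraction(0)] * n + [Fraction(1)]  # the monomial x^n
        u = list(v)
        for w in us:
            c = inner(v, w) / inner(w, w)
            padded = w + [Fraction(0)] * (len(u) - len(w))
            u = [a - c * b for a, b in zip(u, padded)]
        us.append(u)
    return us


P = gram_schmidt_monomials(5)
for n, p in enumerate(P):
    q = rodrigues(n)
    # Since p is monic, the multiple c with Q_n = c * P_n is the leading coefficient of Q_n.
    print(n, q == [q[-1] * c for c in p])
```

Because the Gram-Schmidt output here is monic, the constant relating $Q_n$ to it is simply the leading coefficient of $Q_n$; other normalizations of the Legendre polynomials change only this constant.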


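Example 9. Let $\Lambda$ be a linear functional on a finite-dimensional inner product space $H$.
Then there exists a unique vector $y \in H$ such that, for every $x \in H$, $\Lambda(x) = \langle x, y \rangle$.
Let $\{u_1, \dots, u_n\}$ be an orthonormal basis for $H$, and define $a_i = \Lambda(u_i)$. We
claim that $y = \sum_{i=1}^{n} \overline{a_i} u_i$ is the desired vector. For a vector $x \in H$, use theorem
3.7.5 to write $x = \sum_{i=1}^{n} \hat{x}_i u_i$. On the one hand,
$$\Lambda(x) = \sum_{i=1}^{n} \hat{x}_i \Lambda(u_i) = \sum_{i=1}^{n} a_i \hat{x}_i.$$

On the other hand,
$$\langle x, y \rangle = \Big\langle \sum_{i=1}^{n} \hat{x}_i u_i, \sum_{j=1}^{n} \overline{a_j} u_j \Big\rangle = \sum_{i,j=1}^{n} \hat{x}_i a_j \langle u_i, u_j \rangle = \sum_{i=1}^{n} a_i \hat{x}_i = \Lambda(x). \quad ◆$$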


The Spectral Decomposition of a Normal Matrix 


The goal of this subsection is to derive the spectral decomposition of a normal 
matrix. An exact generalization of this decomposition is valid for compact self- 
adjoint (in fact, normal) operators on infinite-dimensional separable Hilbert 
spaces. 


For a (complex) matrix $A$, we use the symbol $A^*$ to denote the conjugate transpose
of $A$. Thus $(A^*)_{ij} = \overline{a_{ji}}$. The following theorem sums up the properties of conjugate
transposition. We only verify part (c).


Theorem 3.7.8. Let $A$ and $B$ be matrices of compatible sizes for matrix multiplication. Then

(a) $A^{**} = A$,
(b) $(AB)^* = B^* A^*$, and
(c) if $A$ is an $n \times n$ matrix, then, for all $x, y \in \mathbb{C}^n$, $\langle Ax, y \rangle = \langle x, A^* y \rangle$.

Proof. $\langle x, A^* y \rangle = (A^* y)^* x = y^* A^{**} x = y^* A x = \langle Ax, y \rangle$. ∎

Definition. An $n \times n$ matrix $P$ is said to be unitary if its columns form an
orthonormal basis for $\mathbb{C}^n$. A real unitary matrix is specifically called an orthogonal matrix.

Theorem 3.7.9. A unitary matrix $P$ is invertible, and $P^{-1} = P^*$.


Proof. Partition $P$ by its columns, and write $P = (u_1, \dots, u_n)$. Then
$$P^* P = \begin{pmatrix} u_1^* \\ \vdots \\ u_n^* \end{pmatrix} (u_1, \dots, u_n).$$

Thus the $(i,j)$ entry of the product $P^* P$ is $u_i^* u_j = \delta_{ij}$. Hence $P^* P = I_n$. ∎

The simplest example of an orthogonal (hence unitary) matrix is a rotation matrix.


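Example 10. For $\theta \in [0, 2\pi)$, the $2 \times 2$ matrix
$$P_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
is an orthogonal matrix. We show that the linear mapping induced by $P_\theta$ is a
rotation of the plane. Indeed, if we identify a point $(x, y) \in \mathbb{R}^2$ with the complex
number $z = x + iy$ and write $z = re^{it}$, then multiplying $z$ by $e^{i\theta}$ produces the
point $z_1 = x_1 + iy_1$, which is the rotation of $z$ about the origin by the angle
$\theta$. Thus
$$\begin{aligned} z_1 = ze^{i\theta} = re^{i(t+\theta)} &= r[\cos(t + \theta) + i\sin(t + \theta)] \\ &= r[\cos t\cos\theta - \sin t\sin\theta + i(\sin t\cos\theta + \cos t\sin\theta)] \\ &= x\cos\theta - y\sin\theta + i(x\sin\theta + y\cos\theta). \end{aligned}$$

Equating the real and imaginary parts yields
$$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix} = P_\theta \begin{pmatrix} x \\ y \end{pmatrix}. \quad ◆$$
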
Rotation matrices are important because their obvious geometrical properties
typify most of the general properties of a unitary matrix. For example, it is clear
that a rotation of the plane preserves distances between vectors as well as angles
between them. More specifically, $\|P_\theta x\| = \|x\|$ and $\langle P_\theta x, P_\theta y \rangle = \langle x, y \rangle$. This is the
reason many people, including this author, loosely think of orthogonal matrices
as rotations, although this is inaccurate, even in two dimensions. For example, the
following matrix is orthogonal, but it is not a rotation matrix:


Ge 
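
The distinction between rotations and general orthogonal matrices can be seen numerically. In the sketch below, the reflection across the horizontal axis is our own illustrative choice of an orthogonal matrix that is not a rotation (it is not claimed to be the matrix displayed in the text), and the helper names are hypothetical.

```python
import math


def rotation(theta):
    """The rotation matrix P_theta of example 10."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]


def apply(m, v):
    """Multiply a 2x2 matrix by a vector in R^2."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def is_orthogonal(m, tol=1e-12):
    """Check that the columns of m are orthonormal."""
    c1, c2 = [m[0][0], m[1][0]], [m[0][1], m[1][1]]
    return (abs(dot(c1, c1) - 1) < tol and abs(dot(c2, c2) - 1) < tol
            and abs(dot(c1, c2)) < tol)


# Assumption: an illustrative reflection across the x-axis.
reflection = [[1.0, 0.0], [0.0, -1.0]]

P = rotation(0.7)
x, y = [3.0, -1.0], [0.5, 2.0]
print(is_orthogonal(P), det(P))                    # orthogonal, determinant close to 1
print(is_orthogonal(reflection), det(reflection))  # orthogonal, determinant -1
print(dot(apply(P, x), apply(P, y)), dot(x, y))    # inner products agree up to rounding
```

Every rotation matrix has determinant $\cos^2\theta + \sin^2\theta = 1$, so an orthogonal matrix with determinant $-1$ cannot be a rotation.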
The following innocent-sounding question leads to a whole set of interesting
definitions and problems, in both finite and infinite-dimensional inner product
spaces: which linear operators on $\mathbb{R}^2$ can be diagonalized by simply rotating the
axes? The question is exactly equivalent to the question of which matrices can be
diagonalized by a rotation matrix. The immediate generalization is the question
of which (complex) matrices can be diagonalized by a unitary matrix. To answer
the question, suppose that a matrix $A$ can be diagonalized by a unitary matrix
$P = (u_1, \dots, u_n)$. Thus $P^{-1}AP = P^*AP = D$, where $D$ is a diagonal matrix whose
entries $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$ (see the proof of theorem 3.5.7). Then
$$A = PDP^* = (u_1, \dots, u_n) \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} \begin{pmatrix} u_1^* \\ \vdots \\ u_n^* \end{pmatrix} = (u_1, \dots, u_n) \begin{pmatrix} \lambda_1 u_1^* \\ \vdots \\ \lambda_n u_n^* \end{pmatrix}.$$

Thus $A = \sum_{j=1}^{n} \lambda_j u_j u_j^*$. Similarly, $A^* = \sum_{j=1}^{n} \overline{\lambda_j} u_j u_j^*$.

Now
$$AA^* = \sum_{i,j=1}^{n} \lambda_i \overline{\lambda_j}\, u_i (u_i^* u_j) u_j^* = \sum_{i=1}^{n} |\lambda_i|^2 u_i u_i^* = A^*A.$$

The above calculation shows that a necessary condition for a matrix $A$ to be
unitarily diagonalizable is that $A^*A = AA^*$. Such a matrix has a name.


Definition. A (complex) matrix $A$ is called normal if $A^*A = AA^*$. A matrix $A$ is
called Hermitian if $A^* = A$. A Hermitian matrix is clearly normal. Observe that
a real Hermitian matrix is simply a symmetric matrix.


Theorem 3.7.13 establishes the fact that normality is also a sufficient condition for 
the unitary diagonalization of a matrix. 


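Lemma 3.7.10. If $A$ is normal, then, for all $x \in \mathbb{C}^n$, $\|Ax\| = \|A^*x\|$.

Proof.
$$\begin{aligned} \|Ax\|^2 - \|A^*x\|^2 &= \langle Ax, Ax \rangle - \langle A^*x, A^*x \rangle \\ &= \langle A^*Ax, x \rangle - \langle AA^*x, x \rangle \\ &= \langle (A^*A - AA^*)x, x \rangle = \langle 0, x \rangle = 0. \quad ∎ \end{aligned}$$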


Theorem 3.7.11. Let $A$ be a normal matrix. Then a vector $u$ is an eigenvector of
$A$ with the corresponding eigenvalue $\lambda$ if and only if $u$ is an eigenvector of $A^*$
corresponding to the eigenvalue $\overline{\lambda}$.

Proof. It is easy to verify that $A - \lambda I$ is normal and that its conjugate transpose is
$A^* - \overline{\lambda} I$. By the previous lemma, $\|(A - \lambda I)u\| = \|(A^* - \overline{\lambda} I)u\|$. Thus $(A - \lambda I)u =
0$ if and only if $(A^* - \overline{\lambda} I)u = 0$. ∎
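Theorem 3.7.12. If $A$ is a normal matrix, then eigenvectors of $A$ corresponding to
distinct eigenvalues are orthogonal.

Proof. Suppose $Au_1 = \lambda_1 u_1$, $Au_2 = \lambda_2 u_2$, where $u_1 \neq 0 \neq u_2$, and $\lambda_1 \neq \lambda_2$. Then
$$\begin{aligned} \lambda_1 \langle u_1, u_2 \rangle = \langle \lambda_1 u_1, u_2 \rangle &= \langle Au_1, u_2 \rangle \\ &= \langle u_1, A^*u_2 \rangle = \langle u_1, \overline{\lambda_2} u_2 \rangle = \lambda_2 \langle u_1, u_2 \rangle. \end{aligned}$$
Thus $(\lambda_1 - \lambda_2)\langle u_1, u_2 \rangle = 0$, and $\langle u_1, u_2 \rangle = 0$. ∎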


Theorem 3.7.13 (diagonalization). Let $A$ be a normal matrix. Then there exists a
unitary matrix $P$ and a diagonal matrix $D$ such that
$$P^*AP = D.$$

Proof. The proof is inductive. The base case ($n = 2$) is left as an exercise. Let $\lambda_1$ be
an eigenvalue of $A$, and let $v_1$ be a unit eigenvector corresponding to $\lambda_1$. Let
$M = \mathrm{Span}(\{v_1\})$, and let $\{v_2, \dots, v_n\}$ be an orthonormal basis for $M^{\perp}$. By construction,
the matrix $Q = (v_1, \dots, v_n)$ is unitary.

We claim that $Q^*AQ$ has the form
$$Q^*AQ = \begin{pmatrix} \lambda_1 & 0 \\ 0 & A' \end{pmatrix},$$
where $A'$ is an $(n-1) \times (n-1)$ matrix. The $(i,j)$ entry of $Q^*AQ$ is $e_i^T Q^*AQ e_j$, where $\{e_1, \dots, e_n\}$ is the standard basis for $\mathbb{K}^n$. But, for $1 \le i \le n$,
$$e_i^T Q^*AQ e_1 = e_i^T Q^* A v_1 = \lambda_1 e_i^T Q^* v_1 = \lambda_1 (Qe_i)^* v_1 = \lambda_1 v_i^* v_1 = \lambda_1 \delta_{i1}.$$

Thus the entries in the first column of $Q^*AQ$ are what we claim they are.
The entries in the top row are computed similarly by examining the quantity
$e_1^T Q^*AQ e_j$, and using the fact that $A^*v_1 = \overline{\lambda_1} v_1$ (from theorem 3.7.11).

Next we show that the matrix $Q^*AQ$ is normal:
$$\begin{aligned} (Q^*AQ)(Q^*AQ)^* = Q^*AQQ^*A^*Q &= Q^*AA^*Q = Q^*A^*AQ \\ &= Q^*A^*QQ^*AQ = (Q^*AQ)^*(Q^*AQ). \end{aligned}$$

Now
$$(Q^*AQ)(Q^*AQ)^* = \begin{pmatrix} |\lambda_1|^2 & 0 \\ 0 & A'A'^* \end{pmatrix},$$
while
$$(Q^*AQ)^*(Q^*AQ) = \begin{pmatrix} |\lambda_1|^2 & 0 \\ 0 & A'^*A' \end{pmatrix}.$$

This shows that $A'$ is normal.
Invoking the inductive hypothesis, there is a unitary $(n-1) \times (n-1)$ matrix
$Q_1$ such that
$$Q_1^* A' Q_1 = \begin{pmatrix} \lambda_2 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}.$$
Now define
$$P = Q \begin{pmatrix} 1 & 0 \\ 0 & Q_1 \end{pmatrix}.$$

Being the product of two unitary matrices, $P$ is unitary. It is straightforward to
verify that
$$P^*AP = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} = D. \quad ∎$$


Remarks. 1. If we write $P = (u_1, \dots, u_n)$, then, by the proof of theorem 3.5.7,
the eigenvalues of $A$ are $\lambda_1, \dots, \lambda_n$, and $u_1, \dots, u_n$ are the corresponding
eigenvectors.

2. Observe that $A = PDP^* = \sum_{i=1}^{n} \lambda_i u_i u_i^*$. Each of the matrices $P_i = u_i u_i^*$ is a
rank 1 matrix and, in fact, $P_i$ is the projection of $\mathbb{C}^n$ onto the one-dimensional
subspace generated by $u_i$, because $P_i x = (u_i u_i^*)x = u_i(u_i^* x) = \langle x, u_i \rangle u_i$. The
representation
$$A = \sum_{i=1}^{n} \lambda_i P_i$$
is known as the spectral decomposition of the normal matrix A.
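
As a concrete check of the decomposition, the following sketch works with one real symmetric (hence normal) $2 \times 2$ matrix whose eigenpairs are known in closed form. The matrix, its eigenpairs, and the helper names are illustrative assumptions, not taken from the text; since the example is real, $u_i^*$ is just the transpose.

```python
import math


def outer(u):
    """The rank 1 matrix u u^* for a real unit vector u."""
    return [[a * b for b in u] for a in u]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_scale(c, A):
    return [[c * a for a in row] for row in A]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


s = 1 / math.sqrt(2)
A = [[2.0, 1.0], [1.0, 2.0]]
# Eigenpairs of A: A(1, 1)^T = 3(1, 1)^T and A(1, -1)^T = 1(1, -1)^T.
eigenpairs = [(3.0, [s, s]), (1.0, [s, -s])]

P1, P2 = (outer(u) for _, u in eigenpairs)
reconstructed = mat_add(mat_scale(3.0, P1), mat_scale(1.0, P2))

print(close(reconstructed, A))                              # A = 3 P1 + 1 P2
print(close(mat_add(P1, P2), [[1.0, 0.0], [0.0, 1.0]]))     # P1 + P2 = I
print(close(mat_mul(P1, P2), [[0.0, 0.0], [0.0, 0.0]]))     # P1 P2 = 0
```

The last two checks reflect the orthonormality of the eigenvectors: the projections sum to the identity and annihilate each other.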
Spectral Theory for Normal Operators

Let $H$ be a finite-dimensional inner product space, and let $T$ be a linear operator
on $H$. For a fixed element $y \in H$, define a functional $\Lambda_y$ on $H$ by $\Lambda_y(x) = \langle Tx, y \rangle$.
It is clear that $\Lambda_y$ is linear. By example 9, there is a unique vector $T^*y \in H$ such
that $\Lambda_y(x) = \langle x, T^*y \rangle$. Therefore we have a function $T^* : H \to H$ defined by the
requirement that
$$\langle Tx, y \rangle = \langle x, T^*y \rangle$$
for all $x, y \in H$.

It is straightforward to verify that $T^*$ itself is a linear operator on $H$. We call it the
adjoint operator of $T$.

Definition. A linear operator $T$ on a finite-dimensional inner product space $H$ is said to
be normal if $T^*T = TT^*$. We say that $T$ is self-adjoint if $T = T^*$. A self-adjoint
operator is clearly normal.


We will develop the analog of the spectral decomposition of a normal matrix for 
normal operators. 


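Lemma 3.7.14. Let $T$ be a normal operator on a finite-dimensional inner product
space $H$, and let $B = \{v_1, \dots, v_n\}$ be an orthonormal basis for $H$. Then the matrix
$A$ of $T$ relative to $B$ is a normal matrix.

Proof. By theorem 3.5.2, it is sufficient to prove that the matrix of $T^*$ relative to $B$
is $A^*$. By assumption, $T(v_k) = \sum_{i=1}^{n} a_{ik} v_i$ for $1 \le k \le n$. We need to show that,
for $1 \le j \le n$, $T^*(v_j) - \sum_{i=1}^{n} \overline{a_{ji}} v_i = 0$. It is further sufficient to show that, for all
$1 \le j, k \le n$, the quantity $q_{jk} = \langle T^*(v_j) - \sum_{i=1}^{n} \overline{a_{ji}} v_i, v_k \rangle$ is zero:
$$\begin{aligned} q_{jk} &= \langle T^*v_j, v_k \rangle - \sum_{i=1}^{n} \overline{a_{ji}} \langle v_i, v_k \rangle = \langle v_j, Tv_k \rangle - \overline{a_{jk}} \\ &= \Big\langle v_j, \sum_{i=1}^{n} a_{ik} v_i \Big\rangle - \overline{a_{jk}} = \sum_{i=1}^{n} \overline{a_{ik}} \langle v_j, v_i \rangle - \overline{a_{jk}} = \overline{a_{jk}} - \overline{a_{jk}} = 0. \quad ∎ \end{aligned}$$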
i=1 i=1 


Theorem 3.7.15. A normal operator T on a finite-dimensional inner product space 
H is diagonalizable. 


Proof. Fix an orthonormal basis $B = \{v_1, \dots, v_n\}$ for $H$, and let $A$ be the matrix of
$T$ relative to $B$. By the previous lemma, $A$ is normal and, by theorem 3.7.13, $A$ is
diagonalizable by a unitary matrix $P$. Thus $P^*AP = D$. Let $B'$ be the basis of $H$
such that $P$ is the matrix from $B'$ to $B$. Such a basis exists by example 4 in section
3.5. By theorem 3.5.5, the matrix of $T$ relative to $B'$ is $P^{-1}AP = P^*AP = D$, as
desired. We leave it to the reader to verify that $B'$ is, in fact, an orthonormal basis
for $H$. ∎


The above theorem leads to the spectral theorem for normal operators on finite- 
dimensional inner product spaces. 


Theorem 3.7.16 (the spectral theorem). A normal operator $T$ on a finite-dimensional
inner product space can be written as
$$T = \sum_{i=1}^{n} \lambda_i P_i,$$
where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $T$, and $P_1, \dots, P_n$ are rank 1 operators.

Proof. In the notation of the previous theorem, let $u_1, \dots, u_n$ be the columns of
the matrix $P$, and let $\lambda_1, \dots, \lambda_n$ be the diagonal entries of $D$. Then $T(u_i) = \lambda_i u_i$.
Write an arbitrary element $x$ of $H$ as $x = \sum_{i=1}^{n} \hat{x}_i u_i$. Then $T(x) = \sum_{i=1}^{n} \hat{x}_i T(u_i) =
\sum_{i=1}^{n} \lambda_i \hat{x}_i u_i$. Define $P_i(x) = \hat{x}_i u_i = \langle x, u_i \rangle u_i$. Each of the operators $P_i$ is the
projection of $H$ onto the one-dimensional subspace generated by $u_i$ and
$T = \sum_{i=1}^{n} \lambda_i P_i$. ∎


Exercises 


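1. For functions $f, g \in C^{\infty}[0,1]$, define $\langle f, g \rangle = f(0)g(0) + \int_0^1 f'(x) g'(x)\,dx$.
Prove that $\langle \cdot, \cdot \rangle$ is an inner product on $C^{\infty}[0,1]$.

2. Prove that the following generalization of the previous exercise also defines
an inner product on $C^{\infty}[0,1]$: for a fixed positive integer $n$,
$$\langle f, g \rangle = \sum_{i=0}^{n-1} f^{(i)}(0) g^{(i)}(0) + \int_0^1 f^{(n)}(x) g^{(n)}(x)\,dx.$$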
i=0 0 


3. Prove the following properties of inner products, which are often used 
without explicit mention: 
(a) If $x, y$ are vectors in an inner product space $H$ such that
$\langle x, w \rangle = \langle y, w \rangle$ for every $w \in H$, then $x = y$.
(b) For $u_1, \dots, u_n, v_1, \dots, v_m \in H$ and for scalars $a_1, \dots, a_n, b_1, \dots, b_m$,
$$\Big\langle \sum_{i=1}^{n} a_i u_i, \sum_{j=1}^{m} b_j v_j \Big\rangle = \sum_{i=1}^{n} \sum_{j=1}^{m} a_i \overline{b_j} \langle u_i, v_j \rangle.$$
(c) If $u_1, \dots, u_n$ are mutually orthogonal, then
$$\Big\| \sum_{i=1}^{n} a_i u_i \Big\|^2 = \sum_{i=1}^{n} |a_i|^2 \|u_i\|^2.$$


4. Prove that, in an inner product space, $x \perp y$ if and only if $\|x + ay\| = \|x - ay\|$
for every $a \in \mathbb{K}$.

5. Let $u_1, u_2, \dots$ be an infinite orthonormal sequence in an inner product space
$H$. Prove that, for $x \in H$, $\sum_{n=1}^{\infty} |\hat{x}_n|^2 \le \|x\|^2$. Here $\hat{x}_n = \langle x, u_n \rangle$.

6. Prove that if $M$ is a subspace of an inner product space $H$, then $M^{\perp}$ is a
subspace of $H$, and $M \cap M^{\perp} = \{0\}$.

7. Let $M$ be a finite-dimensional proper subspace of an inner product space $H$.
Prove that $M^{\perp} \neq \{0\}$. In particular, there exists a unit vector $x$ orthogonal
to $M$.

8. Consider the space $M = \mathbb{K}(\mathbb{N})$ of finite sequences. Clearly, $M$ is a subspace
of $\ell^2$. Prove that $M^{\perp} = \{0\}$.

9. For real square matrices $A$ and $B$, define $\langle A, B \rangle = \mathrm{tr}(AB^T)$. Prove that $\langle \cdot, \cdot \rangle$
is an inner product on the space of real $n \times n$ matrices.

10. This is a continuation of the previous exercise. Prove that the orthogonal
complement of the space of symmetric matrices is the subspace of skew-symmetric matrices.

11. Let $M$ be a proper subspace of $\mathbb{R}^n$, and let $r = \dim(M)$. Prove that there
exists an $(n-r) \times n$ matrix $A$ whose null-space is $M$. What additional
properties can $A$ have?

12. QR factorization. Prove that every real invertible matrix $A$ can be written
as $A = QR$, where $Q$ is an orthogonal matrix, and $R$ is an upper triangular
matrix. Hint: Apply the Gram-Schmidt process to the columns $a_1, \dots, a_n$
of $A$ to find an orthonormal basis $q_1, \dots, q_n$. For $1 \le i \le n$, $a_i$ is a linear
combination of $q_1, \dots, q_i$.

13. Let $P$ be an orthogonal matrix.

(a) Prove that $\langle Px, Py \rangle = \langle x, y \rangle$ for every $x, y \in \mathbb{R}^n$.

(b) Prove that $\|Px\| = \|x\|$ for every $x \in \mathbb{R}^n$.

14. Let $P$ be an orthogonal matrix.

(a) Prove that P? is orthogonal. 

(b) Prove that det(P) = +1. 

(c) Prove that the product of two orthogonal matrices is orthogonal. 


15. Prove theorem 3.7.13 when $n = 2$.

16. Refer to theorem 3.7.16. Prove that the operators $P_i$ satisfy

(a) $\sum_{i=1}^{n} P_i = I$ (the identity operator on $H$), and

(b) $P_i P_j = \delta_{ij} P_i$.

17. Find an orthogonal matrix that diagonalizes the matrix $A$ below, and write
down the spectral decomposition of A: 


4 
The Metric Topology 


A mathematician who is not also something of a poet will never be a perfect 
mathematician. 
Karl Weierstrass 


Felix Hausdorff. 1868-1942 


Felix Hausdorff was born into a wealthy Jewish family and when he was still 
a young boy, the family moved to Leipzig. He studied at Leipzig University, 
graduating in 1891 with a doctorate in the applications of mathematics to 
astronomy. He published four papers on astronomy and optics over the next 
few years. Hausdorff remained in Leipzig, where he lectured until 1910. He then
moved to Bonn, then to Greifswald in 1913, returning to Bonn in 1921, where he 
continued his work until 1935. 


Hausdorff was the first to coin the definitions of metric and topological spaces. In 
1914, building on work by Maurice Fréchet and others, he published his famous 
text Grundzüge der Mengenlehre. The book was the beginning point for studying
metric and topological spaces, which are now core topics in modern mathematics. 
Among Hausdorff’s numerous achievements, we count his introduction of the 
notion of the Hausdorff dimension, his study of the Gaussian law of errors, limit 
theorems and the problem of moments, and the strong law of large numbers. He 
introduced the concept of a partially ordered set and, from 1901 to 1909, he proved
a series of results on ordered sets. In 1907 he introduced special types of ordinals
in an attempt to prove Cantor’s continuum hypothesis, and he was also the first to
pose the generalized continuum hypothesis.

Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules.
DOI: 10.1093/oso/9780198868781.003.0004


Hausdorff sensed the oncoming calamity of Nazism but made no attempt to 
emigrate while it was still possible. Although he swore the necessary oath to Hitler 
in 1934, he was forced to give up his position in 1935. He continued to undertake 
research in topology and set theory but the results could not be published in 
Germany. As a Jew, Hausdorff’s position grew increasingly more difficult. He 
lived under the constant threat of being deported to an internment camp. Bonn 
University requested that the Hausdorffs be allowed to remain in their home, and 
the request was granted. But, in 1941, they were forced to wear the “yellow star”,
and, in January 1942, the Hausdorffs were informed that they were to be interned 
in Endenich. Together with his wife and his wife’s sister, Hausdorff committed 
suicide on 26 January. 


Hausdorff was, according to a quote attributed to Weierstrass, a perfect mathematician.
Indeed, he was something of a poet, according to the following excerpt:¹


Hausdorff pursued, especially during the early years in Leipzig, a kind of 
double identity: as Felix Hausdorff, the productive mathematician, and as 
Paul Mongré. Under this pseudonym, Hausdorff enjoyed remarkable recog- 
nition within the German intelligentsia at the end of the 19th century as 
a writer, philosopher and socially critical essayist. He fostered a circle of 
friends that consisted of well-known writers, artists and publishers including 
Hermann Conradi, Richard Dehmel, Otto Erich Hartleben, Gustav Kirstein, 
Max Klinger, Max Reger and Frank Wedekind. Between 1897 and 1904, 
Hausdorff reached the peak of his literary-philosophical accomplishment: 
during this period, 18 of a total of 22 works were published under his 
pseudonym. These included the volume of aphorisms Sant’ Ilario: Thoughts
from Zarathustra’s Country, his critique Das Chaos in kosmischer Auslese,
a book of poems entitled Ekstasen, the farce Der Arzt seiner Ehre, as well as
numerous essays, most of which appeared in the leading journal of the day,
“Neue Deutsche Rundschau (Freie Bühne)”. The play was Hausdorff’s greatest
literary success, as it was performed over 300 times in 31 cities. 


¹ Excerpted from Hausdorff Center for Mathematics, Felix Hausdorff, http://www.hcm.uni-bonn.de/about-hcm/felix-hausdorff/about-felix-hausdorff/, accessed Oct. 29, 2020.
4.1 Definitions and Basic Properties


Basic calculus concepts such as limits and continuity are heavily based on the 
concept of proximity. A metric is the most common tool for measuring proximity. 
The definition of a metric is a direct abstraction of the properties of the distance 
function in the plane. The most important characteristics of the Euclidean distance 
are: 


(1) the distance between two points in the plane is nonnegative, and it is zero only when the points coincide,
(2) the distance is a symmetric function, and
(3) the triangle inequality as it is understood in plane geometry.


These three characteristics are the ingredients of the definition of a metric. It is 
an amazing fact that so few axioms produce such a rich structure. The abstraction 
of a simple concept almost never produces a structure with properties identical 
to those of the concept. Indeed, there are fundamental differences between the 
properties of a general metric and those of the Euclidean distance. For example, 
you will see that there are metrics where a ball consists of a single point or the 
entire space. Of course, such metrics generally have much less importance than the 
most common metrics, those induced by a norm. Thus the fact that some metric 
properties violate our sense of geometry does not detract from the usefulness of 
metric spaces as one of the most powerful tools of mathematics. 


Definition. A metric space is a nonempty set $X$ together with a function $d : X \times
X \to \mathbb{R}$ such that, for all $x, y$ and $z \in X$,

(a) $d(x, y) \ge 0$, and $d(x, y) = 0$ if and only if $x = y$,
(b) $d(x, y) = d(y, x)$, and
(c) $d(x, y) \le d(x, z) + d(z, y)$.

The function $d$ is called the distance function, or the metric. Property (c) is
known as the triangle inequality.
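
The three axioms can be tested mechanically on a finite sample of points. The sketch below is only a sanity check under stated assumptions: passing it on a sample does not prove that a function is a metric, although a failure does exhibit a counterexample. The name `is_metric_on` and the sample functions are illustrative choices, not notation from the text.

```python
from itertools import product


def is_metric_on(points, d, tol=1e-12):
    """Check axioms (a), (b), (c) for d on every pair and triple of sample points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):  # axiom (a)
            return False
        if abs(d(x, y) - d(y, x)) > tol:               # axiom (b)
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:          # axiom (c)
            return False
    return True


def usual(x, y):
    """The usual metric on R."""
    return abs(x - y)


def discrete(x, y):
    """The discrete metric."""
    return 0 if x == y else 1


def squared(x, y):
    """Squared distance: symmetric and positive, but not a metric."""
    return (x - y) ** 2


sample = [0, 1, 2, 3.5]
print(is_metric_on(sample, usual))
print(is_metric_on(sample, discrete))
print(is_metric_on(sample, squared))  # fails: squared(0, 2) > squared(0, 1) + squared(1, 2)
```

The squared distance shows that the triangle inequality is a genuine restriction: it satisfies (a) and (b) but violates (c) already for the points 0, 1, 2.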


Example 1. Let $X = \mathbb{R}$, and let $d(x, y) = |x - y|$. The triangle inequality is indeed
the inequality known by the same name in elementary mathematics. In general,
the metric on $\mathbb{R}^n$ given by $d(x, y) = \big( \sum_{i=1}^{n} |x_i - y_i|^2 \big)^{1/2} = \|x - y\|_2$ is called the
Euclidean (or the usual) metric on $\mathbb{R}^n$. In this case, the triangle inequality
follows from Minkowski’s inequality with $p = 2$. ◆


Example 2. Normed linear spaces provide a rich source of metric spaces. If $(X, \|\cdot\|)$
is a normed linear space, and the distance function is defined by $d(x, y) = \|x - y\|$,
then $d(x, y) = \|x - y\| = \|(x - z) + (z - y)\| \le \|x - z\| + \|z - y\| = d(x, z) +
d(z, y)$, and this proves the triangle inequality. The other properties are trivial to
verify. Special cases include all $\ell^p$ spaces, $\mathbb{R}^n$ with any of the $\ell^p$ metrics, and the
space $B[0, 1]$. See section 3.6. ◆


Example 3. Let $X$ be a nonempty set and define the discrete metric on $X$ as
follows:
$$d(x, y) = \begin{cases} 1 & \text{if } x \neq y, \\ 0 & \text{if } x = y. \end{cases} \quad ◆$$


Definition. Let $(X, d)$ be a metric space. The open ball of radius $r$ centered at
$x \in X$ is the set
$$B(x, r) = \{y \in X : d(x, y) < r\}.$$

The special case of this definition stated in section 3.6 when $X$ is a normed linear
space is consistent with the above definition.


Example 4. In $\mathbb{R}$ with the usual metric, $B(x, r)$ is the open interval of radius $r$
centered at $x$. ◆

Example 5. In $(\mathbb{R}^2, \|\cdot\|_2)$, the open ball of radius $r$ centered at $(x_0, y_0)$ is the open
disk of radius $r$ centered at $(x_0, y_0)$. ◆

Example 6. In the space $B[0, 1]$ of bounded real functions on $[0, 1]$, the ball $B(f, r)$
is the set of all bounded functions whose graphs on $[0, 1]$ are between the graphs
of the functions $y = f(x) - r$ and $y = f(x) + r$. ◆

Example 7. In the discrete metric on a set $X$, $B(x, r) = \{x\}$ if $r \le 1$, and $B(x, r) = X$
if $r > 1$. ◆
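
On a finite set the open balls can be listed directly from the definition. The following sketch uses a hypothetical three-point set and a helper `open_ball` introduced only for illustration; it shows the two regimes of the discrete metric.

```python
def open_ball(points, d, center, r):
    """B(center, r) = {y in points : d(center, y) < r}."""
    return {y for y in points if d(center, y) < r}


def discrete(x, y):
    """The discrete metric."""
    return 0 if x == y else 1


X = {"a", "b", "c"}  # a small sample set chosen for illustration

print(open_ball(X, discrete, "a", 0.5))  # a single point
print(open_ball(X, discrete, "a", 1))    # still a single point, since d < 1 forces y = a
print(open_ball(X, discrete, "a", 1.5))  # the whole set
```

The strict inequality in the definition is what places the radius $r = 1$ in the single-point case.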


Definition. A subset of a metric space X is said to be open if it is the union of 
open balls. 


Example 8. An open ball in any metric space $X$ is an open subset of $X$. ◆

Example 9. Consider the discrete metric on a nonempty set $X$. Single-point
subsets of $X$ are open because $B(x, 1) = \{x\}$. It follows that every subset of $X$
is an open set since a set is the union of its single-point subsets. ◆


Example 10. In $\mathbb{R}$ with the usual metric, the interval $(0, \infty)$ is open since $(0, \infty) = \bigcup_{n=1}^{\infty} B(n, n)$. ◆

Theorem 4.1.1. A subset $U$ of a metric space $X$ is open if and only if, for every $x \in U$,
there exists a positive number $\delta$ such that $B(x, \delta) \subseteq U$.


Proof. Suppose $U$ is open, and let $x \in U$. Since $U$ is the union of open balls, there
exists a ball $B(y, r)$ in $X$ such that $x \in B(y, r) \subseteq U$. Since $d(x, y) < r$, the number
$\delta = r - d(x, y)$ is positive. We show that $B(x, \delta) \subseteq B(y, r)$. Let $z \in B(x, \delta)$. Then
$d(z, y) \le d(z, x) + d(x, y) < \delta + d(x, y) = r$. Conversely, if, for every $x \in U$, there
is a positive number $\delta_x$ such that $B(x, \delta_x) \subseteq U$, then $U = \bigcup_{x \in U} B(x, \delta_x)$. ∎


Example 11. In $\mathbb{R}^2$, the first quadrant $U = \{(x, y) : x > 0, y > 0\}$ is open. If
$(x_0, y_0) \in U$, then the disk centered at $(x_0, y_0)$ of radius $\delta < \min(x_0, y_0)$ is
contained in $U$. ◆


Theorem 4.1.2. Let X be a metric space. Then 


(a) The union of an arbitrary collection of open subsets of X is open. 
(b) The intersection of two (hence any finite number of) open sets is open. 
(c) X is open. 


Proof. (a) Let $\{U_\alpha\}$ be a collection of open sets of $X$, and let $U = \bigcup_\alpha U_\alpha$. If $x \in U$,
then $x \in U_\alpha$ for some $\alpha$. Therefore there exists $\delta > 0$ such that $B(x, \delta) \subseteq U_\alpha \subseteq U$.
Thus $U$ is open by theorem 4.1.1.

(b) Let $U$ and $V$ be open subsets of $X$, and let $x \in U \cap V$. By theorem 4.1.1, there
exist positive numbers $\delta_1$ and $\delta_2$ such that $B(x, \delta_1) \subseteq U$, and $B(x, \delta_2) \subseteq V$. Let
$\delta = \min\{\delta_1, \delta_2\}$. Clearly, $B(x, \delta) \subseteq U \cap V$. Again, by theorem 4.1.1, $U \cap V$ is open.

(c) Fix an element $x \in X$. Clearly, $X = \bigcup_{n=1}^{\infty} B(x, n)$. ∎

By definition, the empty set is also an open subset of any metric space. This is
largely a useful convention. For example, the statement of theorem 4.1.2 (b) should
read: “The intersection of two open sets is open or empty.” If we declare the empty
set to be open, the statement as it stands is correct.


Definition. A subset F of a metric space X is closed if its complement X — F is 
open. 


Example 12. In $\mathbb{R}$ with the usual metric, $[a, b]$, $[a, \infty)$, and $\bigcup_{n \in \mathbb{Z}} [2n, 2n + 1]$ are
all closed sets, as the complement of each set is open. ◆


The theorem below follows from theorem 4.1.2 and De Morgan's laws. 
Theorem 4.1.3. Let $X$ be a metric space. Then

(a) $X$ and $\emptyset$ are closed.
(b) A finite union of closed sets is closed.
(c) The intersection of an arbitrary collection of closed sets is closed.


Theorem 4.1.4. Let $x$ and $y$ be distinct elements of a metric space $X$. Then there exist
open sets $U$ and $V$ containing $x$ and $y$, respectively, such that $U \cap V = \emptyset$.

Proof. Let $\delta = d(x, y)$, and set $U = B(x, \delta/2)$ and $V = B(y, \delta/2)$. We show that $U \cap
V = \emptyset$. If $z \in U \cap V$, then $d(x, y) \le d(x, z) + d(z, y) < \delta/2 + \delta/2 = \delta$, which is a
contradiction. ∎


The property established by the above theorem, namely, that distinct points in a 
metric space are contained in disjoint open sets, is called the Hausdorff property. 
This is an important separation property of metric spaces. Common terminology 
used to describe the Hausdorff property is that distinct points in a metric space 
can be separated by disjoint open sets. 


Definition. A neighborhood of a point x of a metric space X is a subset of X that 
contains an open subset of X that contains x. A neighborhood of a point need 
not be open. The concept is sometimes helpful in economizing on verbiage. 


Now you see that distance functions are not created equal. An open ball centered at
a point in the discrete metric is either very small (a single point) or very large
(the whole space), while the collection of neighborhoods of a point $x$ in a normed
linear space includes all the open balls centered at $x$ and is therefore quite rich.
There is another important distinction between a general metric and the metric
generated by a norm. In the latter case, the collection of open neighborhoods
of a point is exactly the translation of the collection of open neighborhoods
of any other point. Thus the neighborhoods of a point are identical (up to a
translation) to the neighborhoods of any other point. The open neighborhoods
in a general metric space can be quite heterogeneous in the sense that knowledge
of the neighborhoods of one point tells us nothing about the open neighborhoods
of other points.


Definition. Let $(x_n)$ be a sequence in a metric space $X$, and let $x \in X$. We say that
$(x_n)$ converges to $x$ if $\lim_n d(x_n, x) = 0$. In this case, we write $\lim_n x_n = x$. We
also say that $x$ is the limit of $(x_n)$. Observe that if $X$ is a normed linear space,
$\lim_n x_n = x$ is equivalent to the condition that $\lim_n \|x_n - x\| = 0$.

Theorem 4.1.5. The limit of a convergent sequence $(x_n)$ is unique.

Proof. Suppose that $\lim_n x_n = x$, $\lim_n x_n = y$, and $x \neq y$. By the Hausdorff property,
there exists $\delta > 0$ such that $B(x, \delta) \cap B(y, \delta) = \emptyset$. There exist natural numbers
$N_1$ and $N_2$ such that, for $n > N_1$, $d(x_n, x) < \delta$, and, for $n > N_2$, $d(x_n, y) < \delta$. If
we choose an integer $n > \max\{N_1, N_2\}$, then $d(x_n, x) < \delta$ and $d(x_n, y) < \delta$, and
$x_n \in B(x, \delta) \cap B(y, \delta) = \emptyset$, which is a contradiction. ∎


Convergence in the spaces $(B[0,1], \|\cdot\|_\infty)$ and $(C[0,1], \|\cdot\|_\infty)$ is equivalent to
uniform convergence. Explicitly stated, a sequence $(f_n)$ of bounded (respectively,
continuous) functions converges in the uniform norm to a bounded (respectively,
continuous) function $f$ if, for every $\epsilon > 0$, there exists a positive integer $N$, dependent
only on $\epsilon$, such that, for all $x \in [0,1]$ and all $n \ge N$, $|f_n(x) - f(x)| < \epsilon$. Clearly, the
pointwise convergence of $(f_n)$ to $f$ is necessary for its uniform convergence to $f$. The
following two examples illustrate that the converse is not true.


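Example 13. Let
$$f_n(x) = \begin{cases} nx & \text{if } 0 \le x \le 1/n, \\ 1 & \text{if } 1/n \le x \le 1. \end{cases}$$
The pointwise limit of the sequence $(f_n)$ is clearly the bounded function
$$f(x) = \begin{cases} 0 & \text{if } x = 0, \\ 1 & \text{if } 0 < x \le 1. \end{cases}$$

However, the sequence $(f_n)$ does not converge to $f$ in $B[0, 1]$ because, for every
$n \in \mathbb{N}$, $\|f_n - f\|_\infty \ge |f_n(1/(2n)) - f(1/(2n))| = |1/2 - 1| = 1/2$. ◆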
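
The gap between pointwise and uniform convergence in example 13 can be observed numerically. The sketch below samples $|f_n - f|$ on a finite grid, so the computed maximum is only a lower bound for $\|f_n - f\|_\infty$; the grid size and function names are assumptions made for illustration.

```python
def f_n(n, x):
    """The function f_n of example 13 on [0, 1]."""
    return n * x if x <= 1 / n else 1.0


def f(x):
    """The pointwise limit of (f_n)."""
    return 0.0 if x == 0 else 1.0


def sup_distance_lower_bound(n, samples=10_000):
    """Maximum of |f_n - f| over an evenly spaced grid in [0, 1]."""
    grid = [k / samples for k in range(samples + 1)]
    return max(abs(f_n(n, x) - f(x)) for x in grid)


# At a fixed point x > 0 the values f_n(x) eventually equal f(x) ...
print([f_n(n, 0.3) for n in (1, 2, 4, 8)])
# ... but the sampled sup-distance stays at least 1/2 for every n tried.
print([sup_distance_lower_bound(n) for n in (1, 5, 50)])
```

Since a grid maximum can only underestimate the supremum, any value of at least $1/2$ here is consistent with the estimate in the example.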


Example 14. Let f,(x) = STATE Clearly, 0 < f,(x) < 1, and lim, f,(x) = 0 for 
nm\x—lin 
all $x \in [0, 1]$. However, $f_n$ does not converge to the zero function in the uniform
norm since $\|f_n\|_\infty = f_n(1/n) = 1$. ◆

The following theorem is occasionally useful. Its proof is left as an exercise.

Theorem 4.1.6. Let $(x_n)$ be a sequence in a metric space $X$, and let $x \in X$. If every
subsequence of $(x_n)$ contains a subsequence that converges to $x$, then $(x_n)$ converges
to $x$.


Exercises 


1. Show that the intersection of an arbitrary collection of open sets need not
be open.

2. Show that an arbitrary union of closed sets is not necessarily closed.

3. Show that a single-point subset of a metric space $X$ is closed. Hint: Use the
Hausdorff property. Conclude that an arbitrary subset of $X$ is the union of
closed sets.

4. Show that $|d(x, z) - d(y, z)| \le d(x, y)$. Hence show that if $\lim_n x_n = x$,
$\lim_n y_n = y$, then $\lim_n d(x_n, y_n) = d(x, y)$.

5. Prove that a convergent sequence in a metric space is bounded.

6. If, in a normed linear space $X$, $\lim_n x_n = x$, $\lim_n y_n = y$, and $(a_n)$ and $(b_n)$
are scalar sequences that converge to $a$ and $b$, respectively, then $\lim_n (a_n x_n +
b_n y_n) = ax + by$.

7. Prove that if $\lim_n x_n = x$, then every subsequence of $(x_n)$ converges to $x$.

8. Prove theorem 4.1.6.

9. Show that $\lim_n x_n = x$ if and only if every neighborhood of $x$ contains all
but finitely many terms of $(x_n)$.

10. Give an example of a normed linear space that contains an uncountable
number of mutually disjoint balls of equal radii. Hint: Let $A$ be the subset
of $\ell^\infty$ of all binary sequences. What is the distance between any pair of points
in $A$?

11. Prove that the sphere $S^{n-1} = \{x \in \mathbb{R}^n : \|x\|_2 = 1\}$ is a closed subset of $\mathbb{R}^n$.


4.2 Interior, Closure, and Boundary 


The notions of interior, closure, and boundary are quite familiar, and their meaning
is rather obvious for simple sets. For example, the interior of the closed
disk $D = \{x \in \mathbb{R}^2 : \|x\|_2 \le 1\}$ is the open disk $U = \{x \in \mathbb{R}^2 : \|x\|_2 < 1\}$, and the
boundary of $D$ is the unit circle. The fact that a concept is intuitively obvious
is no substitute for a definition. It is often the case that the definition of a 
familiar concept deepens our realization that familiarity and simplicity are not 
synonymous. You will see in this section that the interior of Q is empty, that its 
boundary is the entire real line, and that important subsets of R, such as the Cantor 
set, can come in infinitely many fragments. Intuitively speaking, one expects the 
definitions to capture the ideas that an interior point of a set A must be completely 
surrounded by points of A and that a boundary point of A falls on the edge of A. 
Thus any ball centered at a boundary point of A falls partially inside A and partially 
outside it. This section formulates precise generalizations of those concepts. We 
will also see that disjoint closed sets can be separated in much the same way that 
the Hausdorff property separates points. 


Definition. Let $A$ be a nonempty subset of a metric space $X$. The interior of $A$,
denoted $\operatorname{int}(A)$, is the union of all the open subsets of $X$ contained in $A$. A point
of $\operatorname{int}(A)$ is called an interior point of $A$.

Example 1. The interior of a nonempty subset may well be empty. The simplest
example is the subset $\mathbb{Q}$ of the metric space $\mathbb{R}$; $\operatorname{int}(\mathbb{Q}) = \emptyset$ because $\mathbb{Q}$ contains
no open intervals and hence no nonempty open subsets of $\mathbb{R}$. ◆


The proofs of the following two theorems are straightforward. 


Theorem 4.2.1. The interior of a subset $A$ is the largest open subset of $X$ contained
in $A$. A subset $A$ is open if and only if $\operatorname{int}(A) = A$. Finally, if $A \subseteq B$, then
$\operatorname{int}(A) \subseteq \operatorname{int}(B)$.


Theorem 4.2.2. Let $A$ be a nonempty subset of $X$, and let $x \in X$. Then $x \in \operatorname{int}(A)$ if
and only if there exists $\delta > 0$ such that $B(x,\delta) \subseteq A$.


The above theorem captures what it means for an interior point of A to be totally 
surrounded by points of A. In fact, the statement of theorem 4.2.2 can be taken as 
the definition of an interior point x of A. One can then define the interior of A to 
be the set of all interior points of A. 


Definition. Let $A$ be a subset of a metric space $X$. The closure, $\overline{A}$, of $A$ is the
intersection of all the closed subsets of $X$ containing $A$. Points of $\overline{A}$ are called closure
points of $A$.

Since $X$ is closed and it contains $A$, the closure of a nonempty set
is nonempty. The following theorem is an immediate consequence of theorem
4.1.3.

Theorem 4.2.3. The closure, $\overline{A}$, of $A$ is the smallest closed subset of $X$ containing $A$.
A subset $A$ of $X$ is closed if and only if $A = \overline{A}$. Finally, if $A \subseteq B$, then $\overline{A} \subseteq \overline{B}$.


Theorem 4.2.4. Let $A$ be a nonempty subset of $X$, and let $x \in X$. Then $x \in \overline{A}$ if and
only if for every $\delta > 0$, $A \cap B(x,\delta) \ne \emptyset$.


Proof. Suppose $x \notin \overline{A}$. Then $x \in X - \overline{A}$, which is open. Thus there exists $\delta > 0$ such
that $B(x,\delta) \subseteq X - \overline{A}$. In particular, $B(x,\delta) \cap A = \emptyset$. Conversely, if for some
$\delta > 0$, $B(x,\delta) \cap A = \emptyset$, then $A \subseteq X - B(x,\delta)$, which is closed, so
$\overline{A} \subseteq \overline{X - B(x,\delta)} = X - B(x,\delta)$. In particular, $x \notin \overline{A}$. ∎


Example 2. In $\mathbb{R}$ with the usual metric, $\overline{\mathbb{Q}} = \mathbb{R}$. This is because every open interval
in $\mathbb{R}$ contains rational points. ◆


Example 3. (The Comb) Consider the following subset of $\mathbb{R}^2$:
$A = \bigcup_{n=1}^{\infty} \left( \{1/n\} \times [0,1] \right)$. The line segments $\{1/n\} \times [0,1]$ are called the teeth of
$A$. We claim that $\overline{A} = A \cup (\{0\} \times [0,1])$. Since $A$ is contained in the closed unit
square $S = [0,1] \times [0,1]$, $\overline{A} \subseteq S$. Any point in $S$ that does not belong to $A$ or the
line segment $\{0\} \times [0,1]$ must lie strictly between two consecutive teeth, and a
small-enough disk centered at the point is strictly contained between the two
teeth. Finally, any disk centered at a point on $\{0\} \times [0,1]$ intersects all the teeth
from some index $n$ on. ◆


Theorem 4.2.5. Let $A$ be a nonempty subset of $X$, and let $x \in X$. Then $x \in \overline{A}$ if and
only if there exists a sequence $(x_n)$ in $A$ such that $\lim_n x_n = x$.


Proof. Suppose $\lim_n x_n = x$, where each $x_n \in A$, and let $\delta > 0$. There exists a natural
number $N$ such that, for all $n \ge N$, $d(x_n,x) < \delta$. In particular, $x_N \in B(x,\delta) \cap A$;
thus $x \in \overline{A}$, by theorem 4.2.4. Conversely, suppose $x \in \overline{A}$. By theorem 4.2.4,
$B(x,1/n) \cap A \ne \emptyset$ for all $n \in \mathbb{N}$. Choose a point $x_n \in B(x,1/n) \cap A$. Clearly,
$\lim_n x_n = x$. ∎


Definition. Let $A$ be a subset of a metric space $X$. A point $x \in X$ is called a limit
point of $A$ if, for every $\delta > 0$, $B(x,\delta) \cap A$ contains a point other than $x$. A point
of $A$ that is not a limit point of $A$ is called an isolated point of $A$.


Observe that a limit point of $A$ need not belong to $A$. A point $x$ of $A$ is isolated if
and only if there is $\delta > 0$ such that $B(x,\delta) \cap A = \{x\}$.


Example 4. In $\mathbb{R}^2$ with the usual metric, points on the unit circle $\{(x,y) : x^2 + y^2 = 1\}$
are limit points of the open unit disk $\{(x,y) : x^2 + y^2 < 1\}$. ◆


Example 5. In $\mathbb{R}$, every point of $\mathbb{N}$ is an isolated point of $\mathbb{N}$. A little reflection shows
that every point of the set $A = \{1/n : n \in \mathbb{N}\}$ is an isolated point of $A$. ◆


Theorem 4.2.6. If $A$ is a nonempty subset of $X$, and $x \in X$, then $x$ is a limit point
of $A$ if and only if there exists a sequence $(x_n)$ of distinct points of $A$ such that
$\lim_n x_n = x$.


Proof. Let $x$ be a limit point of $A$. There exists a point $x_1 \in A$ such that $0 < d(x_1,x) < 1$.
Let $\delta_1 = \min\{d(x_1,x), 1/2\}$. There exists a point $x_2 \in A$ such that $0 < d(x_2,x) < \delta_1$.
Note that $x_2 \ne x_1$ by construction. The rest of the construction is inductive.
Having found points $x_1, \dots, x_n$ such that $0 < d(x_n,x) < \dots < d(x_1,x)$ and
$d(x_i,x) < 1/i$, let $\delta_n = \min\{\frac{1}{n+1}, d(x_n,x)\}$, then choose a point $x_{n+1} \in A$ such that
$0 < d(x_{n+1},x) < \delta_n$. Clearly, $\lim_n x_n = x$. The converse is straightforward. ∎


Definition. The derived set of a subset $A$ of a metric space $X$, denoted by $A'$, is
the set of all limit points of $A$.

Theorem 4.2.7. $\overline{A} = A \cup A'$. Thus $A$ is closed if and only if it contains all its limit
points.


Proof. By theorems 4.2.5 and 4.2.6, $A' \subseteq \overline{A}$, and, by definition, $A \subseteq \overline{A}$. Thus
$A \cup A' \subseteq \overline{A}$. Now suppose $x \in \overline{A}$ and that $x \notin A$. Since $x \in \overline{A}$, $B(x,\delta) \cap A \ne \emptyset$ for every
$\delta > 0$. Because $x \notin A$, $B(x,\delta) \cap A$ contains a point of $A$ other than $x$. This makes
$x$ a limit point of $A$, by definition. ∎


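Definition. The boundary of a subset $A$ of a metric space $X$ is the set
$\partial A = \overline{A} \cap \overline{X - A}$. Points of $\partial A$ are called the boundary points of $A$. Observe that
$x \in \partial A$ if and only if every neighborhood of $x$ intersects both $A$ and $X - A$.


Example 6. In $\mathbb{R}$ with the usual metric, $\partial \mathbb{Q} = \mathbb{R}$. This is because every open
interval in $\mathbb{R}$ contains both rational and irrational points. ◆


Theorem 4.2.8. $\overline{A} = \operatorname{int}(A) \cup \partial A$.

Proof. It is enough to show that $\overline{A} \subseteq \operatorname{int}(A) \cup \partial A$. The reverse containment is
obvious. Let $x \in \overline{A} - \partial A$. Since every open ball centered at $x$ intersects $A$, and since
$x \notin \partial A$, there exists an open ball $B(x,\delta)$ that does not intersect $X - A$. This means
that $B(x,\delta) \subseteq A$; hence $x \in \operatorname{int}(A)$. ∎

Theorem 4.2.9. $\overline{A} = A \cup \partial A$.


Proof. By the previous theorem, $\overline{A} = \operatorname{int}(A) \cup \partial A \subseteq A \cup \partial A \subseteq \overline{A}$. ∎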


Definition. Let $A$ be a nonempty subset of a metric space $X$, and let $x \in X$. The
distance from $x$ to $A$ is $\operatorname{dist}(x,A) = \inf\{d(x,a) : a \in A\}$.


Definition. Let $A$ and $B$ be nonempty subsets of a metric space $X$. The distance
between $A$ and $B$ is $\operatorname{dist}(A,B) = \inf\{d(a,b) : a \in A, b \in B\}$.


Observe that $\operatorname{dist}(x,A)$ and $\operatorname{dist}(A,B)$ are always finite numbers.


Example 7. Let $A = \{n \in \mathbb{N} : n \ge 2\}$ and $B = \{n + \frac{1}{n} : n \ge 2\}$. Then $\operatorname{dist}(A,B) = 0$.

To see this, observe that $a_n = n \in A$, that $b_n = n + \frac{1}{n} \in B$, and that
$|a_n - b_n| = 1/n \to 0$ as $n \to \infty$. ◆
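The computation in example 7 can be explored numerically. The following Python sketch is only an illustration: the helper name `dist_truncated` and the cutoff `N` are assumptions introduced here, not part of the text. It computes, in exact rational arithmetic, the distance between the finite truncations $A_N = \{2, \dots, N\}$ and $B_N = \{n + 1/n : 2 \le n \le N\}$. The value shrinks as $N$ grows, which is consistent with $\operatorname{dist}(A,B) = 0$ even though $A$ and $B$ are disjoint.

```python
from fractions import Fraction


def dist_truncated(N):
    """Distance between A_N = {2, ..., N} and B_N = {n + 1/n : 2 <= n <= N}."""
    A = [Fraction(n) for n in range(2, N + 1)]
    B = [Fraction(n) + Fraction(1, n) for n in range(2, N + 1)]
    # Both sets are finite, so the infimum is a minimum over all pairs.
    return min(abs(a - b) for a in A for b in B)


if __name__ == "__main__":
    for N in (2, 10, 100):
        print(N, dist_truncated(N))
```

For these truncations the minimum is attained at the pair $(a_N, b_N)$, so the computed value is $1/N$.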


Theorem 4.2.10. Let $A$ be a nonempty subset of a metric space $X$. Then
$\overline{A} = \{x \in X : \operatorname{dist}(x,A) = 0\}$.

Proof. Suppose $x \in \overline{A}$. By theorem 4.2.5, there exists a sequence $(x_n)$ in $A$ such that
$\lim_n x_n = x$. Thus $\operatorname{dist}(x,A) = \inf\{d(x,a) : a \in A\} \le d(x_n,x)$ for all $n \in \mathbb{N}$. Since
$\lim_n d(x_n,x) = 0$, $\operatorname{dist}(x,A) = 0$. Conversely, if $\operatorname{dist}(x,A) = 0$, then there exists a
sequence of points $x_n \in A$ such that $\lim_n d(x_n,x) = 0$. Thus $\lim_n x_n = x$, and $x \in \overline{A}$
by theorem 4.2.5. ∎


Definition. Let $A$ be a nonempty subset of a metric space $X$. The diameter of
$A$ is $\operatorname{diam}(A) = \sup\{d(x,y) : x,y \in A\}$. If $\operatorname{diam}(A)$ is finite, we say that $A$ is a
bounded subset of $X$. A sequence $(x_n)$ is said to be bounded if its range, $\{x_n\}$,
is a bounded set. If $\operatorname{diam}(X) < \infty$, we say that $d$ is a bounded metric.


Theorem 4.2.11. $\operatorname{diam}(\overline{A}) = \operatorname{diam}(A)$.


Proof. Clearly, $\operatorname{diam}(A) \le \operatorname{diam}(\overline{A})$. To prove that $\operatorname{diam}(\overline{A}) \le \operatorname{diam}(A)$, let $x,y \in \overline{A}$.
We will show that $d(x,y) \le \operatorname{diam}(A)$. There exist sequences $(x_n)$ and $(y_n)$ in $A$
such that $\lim_n x_n = x$ and $\lim_n y_n = y$. Now $d(x,y) \le d(x,x_n) + d(x_n,y_n) + d(y_n,y) \le
d(x_n,x) + \operatorname{diam}(A) + d(y_n,y)$. The desired inequality follows from the above string
of inequalities by taking the limit as $n \to \infty$. ∎


Separation by Open Sets 


Separation is a central idea in topology and analysis, and its importance cannot be 
exaggerated. The Hausdorff property is the simplest form of separation. We will 
see below that closed sets can be separated in much the same way points can be.


Theorem 4.2.12. Let $F$ be a closed subset of $X$, and let $x \in X - F$. Then there exist
open subsets $U$ and $V$ such that $x \in U$, $F \subseteq V$, and $U \cap V = \emptyset$.


Proof. Since $x \in X - F$, which is open, there exists $\delta > 0$ such that $B(x,\delta) \subseteq X - F$.
For every $y \in F$, $d(x,y) \ge \delta$; hence $B(x,\delta/2) \cap B(y,\delta/2) = \emptyset$. The open sets
$U = B(x,\delta/2)$ and $V = \bigcup\{B(y,\delta/2) : y \in F\}$ satisfy the conclusion of the theorem. ∎


An alternative (and commonly used) terminology to summarize theorem 4.2.12 is 
to say that there are disjoint open subsets that separate a closed subset of X and a 
point outside it. 


Theorem 4.2.13. Let $E$ and $F$ be disjoint closed subsets of a metric space $X$. Then $E$
and $F$ can be separated by disjoint open subsets of $X$.

Proof. We need to find disjoint open subsets $U$ and $V$ such that $E \subseteq U$ and $F \subseteq V$.
For $x \in E$, $\operatorname{dist}(x,F) > 0$, by theorem 4.2.10. Let $\delta_x = \operatorname{dist}(x,F)$. By the proof
of theorem 4.2.12, for every $y \in F$, $B(x,\delta_x/2) \cap B(y,\delta_x/2) = \emptyset$. For $y \in F$, let
$\delta_y = \operatorname{dist}(y,E) > 0$. By the proof of theorem 4.2.12, for every $x \in E$,
$B(x,\delta_y/2) \cap B(y,\delta_y/2) = \emptyset$. Let $U = \bigcup_{x \in E} B(x,\delta_x/2)$, and $V = \bigcup_{y \in F} B(y,\delta_y/2)$. Clearly, $U$ and
$V$ are open, $E \subseteq U$, and $F \subseteq V$. It remains to show that $U$ and $V$ are disjoint.
If $z \in U \cap V$, then $z \in B(x,\delta_x/2) \cap B(y,\delta_y/2)$ for some $x \in E$ and $y \in F$. Now,
$d(x,y) \le d(x,z) + d(z,y) < \delta_x/2 + \delta_y/2 \le \max\{\delta_x,\delta_y\}$. But $d(x,y) \ge \operatorname{dist}(x,F) = \delta_x$
and $d(x,y) \ge \operatorname{dist}(y,E) = \delta_y$. We have arrived at a contradiction that proves
the theorem. ∎


Example 8. Let $E = \{(x,y) \in \mathbb{R}^2 : x > 0, y \ge 1/x^2\}$, and $F = \{(x,y) \in \mathbb{R}^2 : x < 0, y \ge 1/x^2\}$.
Clearly, $E$ and $F$ are disjoint closed subsets of the plane. They
are separated by the open right and left half planes. ◆


Subspaces 


Let (X,d) be a metric space, and let A be a subset of X. The defining conditions of 
the metric are clearly satisfied by the elements of A. Since the distance function is 
the only defining characteristic of a metric space, the pair (A, d) is a metric space 
in its own right. We say that (A, d) is a subspace of (X,d), and the metric d on A is 
called the restricted (induced, or subspace) metric. 


If $A$ is a subspace of a metric space $X$, we use the notation $B_A(x,\delta)$ to denote the ball
in $A$ of radius $\delta$ centered at a point $x \in A$. Thus $B_A(x,\delta) = \{a \in A : d(x,a) < \delta\}$.
We use the notation $\overline{B}_A$ to denote the closure of a subset $B$ of $A$ in the restricted
metric.


Theorem 4.2.14. Let $A$ be a subspace of a metric space $X$, and let $B \subseteq A$. Then


(a) $B_A(x,\delta) = B(x,\delta) \cap A$.

(b) $B$ is open in the restricted metric on $A$ if and only if there exists an open subset
$U$ of $X$ such that $B = U \cap A$.

(c) $B$ is closed in the restricted metric on $A$ if and only if there exists a closed subset
$E$ of $X$ such that $B = E \cap A$.


Proof. We prove part (b) and leave the rest of the statements to the reader. If $B$ is an
open subset of $A$, then $B$ is the union of open balls in $A$. Thus $B = \bigcup_{\alpha \in I} B_A(x_\alpha,\delta_\alpha)$.
By part (a), $B_A(x_\alpha,\delta_\alpha) = B(x_\alpha,\delta_\alpha) \cap A$; thus,
$B = \bigcup_{\alpha \in I} [B(x_\alpha,\delta_\alpha) \cap A] = [\bigcup_{\alpha \in I} B(x_\alpha,\delta_\alpha)] \cap A = U \cap A$,
where $U = \bigcup_{\alpha \in I} B(x_\alpha,\delta_\alpha)$, which is open in $X$. We leave
the proof of the converse as an exercise. ∎


The Cantor Set


Consider the closed unit interval $I = [0,1]$. Trisect $I$ and remove the open middle
third $(1/3,2/3)$. This leaves two closed intervals: $I_{1,1} = [0,1/3]$ and $I_{1,2} = [2/3,1]$.
Let $C_1 = I_{1,1} \cup I_{1,2}$. Repeat the construction for each of the intervals $I_{1,1}$ and
$I_{1,2}$, thus removing the middle open third of each of the two intervals. This
leaves four closed intervals: $I_{2,1} = [0,1/9]$, $I_{2,2} = [2/9,1/3]$, $I_{2,3} = [2/3,7/9]$, and
$I_{2,4} = [8/9,1]$. Define $C_2 = \bigcup_{j=1}^{4} I_{2,j}$. Repeating this construction yields, for every
$n \in \mathbb{N}$, a sequence of closed intervals $I_{n,1}, \dots, I_{n,2^n}$, each of length $3^{-n}$. Define
$C_n = \bigcup_{j=1}^{2^n} I_{n,j}$.


The Cantor set is defined to be $C = \bigcap_{n=1}^{\infty} C_n$.
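The trisection procedure is easy to mechanize. The sketch below is illustrative only: the function name `cantor_intervals` is an assumption made here, not notation from the text. It lists the $2^n$ closed intervals $I_{n,1}, \dots, I_{n,2^n}$ as pairs of exact rational endpoints, so the first few stages can be compared with the intervals written out above.

```python
from fractions import Fraction


def cantor_intervals(n):
    """Return the 2**n closed intervals I_{n,1}, ..., I_{n,2^n} as (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_level = []
        for left, right in intervals:
            third = (right - left) / 3
            next_level.append((left, left + third))    # closed left third
            next_level.append((right - third, right))  # closed right third
        intervals = next_level
    return intervals


if __name__ == "__main__":
    for left, right in cantor_intervals(2):
        print(left, right)
```

Each stage doubles the number of intervals and divides their common length by 3, so the total length of $C_n$ is $(2/3)^n$.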

It is clear that $C$ is an infinite set because it contains the endpoints of each of the
intervals $I_{n,j}$ for all $n \in \mathbb{N}$ and all $1 \le j \le 2^n$. What is less obvious is whether $C$
contains any additional points. The surprising answer is that $C$ is uncountable.


First we establish some topological properties of C. 


Definition. A closed subset $A$ of a metric space $X$ is said to be a perfect set if every
point of $A$ is a limit point of $A$. Thus $A$ is perfect if it is equal to its derived set.


Example 9. The closed unit interval $[0,1]$ is a perfect set. ◆


Definition. A subset $A$ of a metric space $X$ is said to be nowhere dense if
$\operatorname{int}(\overline{A}) = \emptyset$.


Example 10. A hyperplane $M$ in $\mathbb{R}^n$ is nowhere dense in $\mathbb{R}^n$.

Without loss of generality, assume that the hyperplane contains the origin.
Thus there is a nonzero vector $a$ such that $M = \{x \in \mathbb{R}^n : a^T x = 0\}$. For $x \in M$
and $\varepsilon > 0$, the open ball $B = B(x,\varepsilon)$ is not contained in $M$ because the point
$x + \frac{\varepsilon a}{2\|a\|} \in B - M$. ◆


Lemma 4.2.15. The Cantor set is closed, perfect, and nowhere dense.

Proof. Since each $C_n$ is closed, $C = \bigcap_{n=1}^{\infty} C_n$ is closed.

We show that $C$ contains no open intervals. This proves that $\operatorname{int}(C) = \emptyset$. Let $J$ be an
open interval of length $\varepsilon > 0$, and choose an integer $n$ such that $3^{-n} < \varepsilon$. Since
the length of each of the intervals $I_{n,j}$ is $3^{-n}$, none of the intervals $I_{n,j}$ contains $J$.
Since $J$ is connected,² it cannot be contained in the (disconnected) union of two or
more of the intervals $I_{n,j}$. Thus $J$ is not contained in $\bigcup_{j=1}^{2^n} I_{n,j} = C_n$. Hence $J$ is not
contained in $C$.


Finally, let $x \in C$ and consider the interval $(x - \varepsilon, x + \varepsilon)$. Again choose an integer $n$
such that $3^{-n} < \varepsilon$. Since $x \in C_n$, $x \in I_{n,j}$ for some $1 \le j \le 2^n$. Because the length of
$I_{n,j}$ is $3^{-n} < \varepsilon$, $I_{n,j} \subseteq (x - \varepsilon, x + \varepsilon)$. Thus $(x - \varepsilon, x + \varepsilon)$ intersects $C$ at a point other
than $x$ (at least one of the endpoints of $I_{n,j}$ is not equal to $x$). This shows that every
point in $C$ is a limit point of $C$; hence $C$ is perfect. ∎


In the rest of this subsection and in the section exercises, we need to consider the
ternary (base 3) expansions of points in $[0,1]$. Every point $x \in [0,1]$ has a ternary
representation $x = \sum_{i=1}^{\infty} \frac{a_i}{3^i}$, where each $a_i \in \{0,1,2\}$. The sum may well be finite,
$x = \sum_{i=1}^{n} \frac{a_i}{3^i}$, and the ternary representation of a number may not be unique, but
this point is of no immediate consequence. However, see the section exercises.


Lemma 4.2.16. If $x = \sum_{i=1}^{n} \frac{a_i}{3^i}$, $a_i \in \{0,2\}$, then $x$ is the left endpoint of an interval
$I_{n,j}$ for some $1 \le j \le 2^n$.³


Proof. We prove the result by induction on $n$. When $n = 1$, $x = \frac{a_1}{3}$. If $a_1 = 0$, $x = 0$,
which is the left endpoint of $I_{1,1}$. If $a_1 = 2$, $x = 2/3$, which is the left endpoint of $I_{1,2}$.


Now suppose the statement is true for a certain integer $n$. Consider a number
$y = \sum_{i=1}^{n+1} \frac{a_i}{3^i}$, where $a_i \in \{0,2\}$. If $a_{n+1} = 0$, there is nothing to prove, so suppose
$a_{n+1} = 2$. By the inductive hypothesis, the number $x = \sum_{i=1}^{n} \frac{a_i}{3^i}$ is the left endpoint
of an interval $I_{n,j}$ for some $1 \le j \le 2^n$. Since $y = x + \frac{2}{3^{n+1}}$, $y$ is the left endpoint of
the right closed subinterval that results from the trisection of $I_{n,j}$. Thus $y$ is the left
endpoint of an interval $I_{n+1,k}$ for some $1 \le k \le 2^{n+1}$. ∎
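Lemma 4.2.16 can be checked by brute force for small $n$. The following sketch is an illustration under stated assumptions: the helper names `left_endpoints` and `ternary_sums` are introduced here and do not appear in the text. It builds the left endpoints of the intervals $I_{n,j}$ directly from the trisection and compares them with all finite sums $\sum_{i=1}^{n} a_i/3^i$ whose digits lie in $\{0,2\}$.

```python
from fractions import Fraction
from itertools import product


def left_endpoints(n):
    """Left endpoints of the 2**n intervals I_{n,j}, in increasing order."""
    lefts = [Fraction(0)]
    length = Fraction(1)
    for _ in range(n):
        length /= 3
        # Each interval keeps its left endpoint and produces a right third
        # whose left endpoint lies two thirds of the way along the parent.
        lefts = [p for left in lefts for p in (left, left + 2 * length)]
    return lefts


def ternary_sums(n):
    """All sums a_1/3 + ... + a_n/3**n with every digit a_i in {0, 2}, sorted."""
    return sorted(
        sum(Fraction(a, 3 ** i) for i, a in enumerate(digits, start=1))
        for digits in product((0, 2), repeat=n)
    )


if __name__ == "__main__":
    print(left_endpoints(3) == ternary_sums(3))
```

Agreement of the two lists for a given $n$ is consistent with the lemma and with its converse, which is left to the exercises; it is numerical evidence for small $n$, not a proof.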


Proposition 4.2.17. If $y = \sum_{i=1}^{\infty} \frac{a_i}{3^i}$, where $a_i \in \{0,2\}$, then $y \in C$.


Proof. It is enough to prove that $y \in C_n$ for every $n \in \mathbb{N}$. Let $x = \sum_{i=1}^{n} \frac{a_i}{3^i}$. By the
previous lemma, $x$ is the left endpoint of some interval $I_{n,j}$. Since the length of $I_{n,j}$
is $3^{-n}$ and $0 \le y - x \le \sum_{i=n+1}^{\infty} \frac{2}{3^i} = 3^{-n}$, $y \in I_{n,j} \subseteq C_n$. ∎


² To say that $J$ is connected means that if $x,y \in J$, and $x < z < y$, then $z \in J$.

³ Observe that the statement does not exclude the possibility that $a_n = 0$. This is because if a point
$x$ is the left endpoint of an interval $I_{m,k}$ for some $m$ and some $1 \le k \le 2^m$, then $x$ is the left endpoint
of $I_{n,j}$ for every $n > m$ and some $1 \le j \le 2^n$. The reason is that the successive trisections of $I_{m,k}$ always
result in an interval (the leftmost) whose left endpoint is $x$.




We now need the binary (base 2) representations of numbers in the interval $[0,1]$.
In this system, every $x \in [0,1]$ can be written as a series $\sum_{i=1}^{\infty} \frac{a_i}{2^i}$, where $a_i \in \{0,1\}$.
Again, such a representation may be finite: $x = \sum_{i=1}^{n} \frac{a_i}{2^i}$; in this case, $x$ does
not have a unique representation. For example, the number $1/2$ can also be written
as $1/4 + 1/8 + 1/16 + \cdots$. In general, the number $x = \sum_{i=1}^{n} \frac{a_i}{2^i}$, where $a_n = 1$, can
also be written as $x = \sum_{i=1}^{n-1} \frac{a_i}{2^i} + \sum_{i=n+1}^{\infty} \frac{1}{2^i}$. In order to avoid ambiguity, we use
the latter representation of $x$ and not the finite sum representation.


Theorem 4.2.18. The Cantor set has cardinality $\mathfrak{c}$.

Proof. We define a function $f : [0,1] \to C$ as follows: $f(0) = 0$, and, for $x \in (0,1]$,
write $x = \sum_{i=1}^{\infty} \frac{a_i}{2^i}$ and define $f(x) = \sum_{i=1}^{\infty} \frac{2a_i}{3^i}$. By the previous proposition, $f(x) \in C$.
We leave it to the reader to verify that $f$ is one-to-one. Now lemma 2.2.3 implies
that $C$ is equivalent to $[0,1]$; hence $\operatorname{Card}(C) = \operatorname{Card}([0,1]) = \mathfrak{c}$. ∎
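The digit-doubling map in the proof of theorem 4.2.18 can be illustrated on finite digit strings. The sketch below rests on assumptions stated here: the names `f_partial` and `in_C_n` are hypothetical helpers, and only finite truncations of the binary expansion are handled, so this is a sanity check and not a substitute for the proof. `f_partial` sends binary digits $(a_1, \dots, a_n)$ to $\sum_{i=1}^{n} 2a_i/3^i$, and `in_C_n` tests membership in $C_n$ by following the trisection.

```python
from fractions import Fraction
from itertools import product


def f_partial(bits):
    """Send binary digits (a_1, ..., a_n) to the sum of 2*a_i / 3**i."""
    return sum(
        (Fraction(2 * a, 3 ** i) for i, a in enumerate(bits, start=1)),
        Fraction(0),
    )


def in_C_n(y, n):
    """Decide whether y lies in C_n by descending through the trisections."""
    left, right = Fraction(0), Fraction(1)
    if not (left <= y <= right):
        return False
    for _ in range(n):
        third = (right - left) / 3
        if left <= y <= left + third:
            right = left + third      # y lies in the closed left third
        elif right - third <= y <= right:
            left = right - third      # y lies in the closed right third
        else:
            return False              # y fell into a removed middle third
    return True


if __name__ == "__main__":
    images = {f_partial(b) for b in product((0, 1), repeat=6)}
    print(len(images), all(in_C_n(v, 10) for v in images))
```

Distinct digit strings of a fixed length give distinct images, which is the finite shadow of the injectivity claim left to the reader.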


Exercises 


1. Which of the following subsets of $\mathbb{R}^2$ are open?

(a) $A = \{(x,y) \in \mathbb{R}^2 : x \ne 0, y < 1/x^2\}$
(b) B=U,o{(x, y) ER? : xER,y=a*} 

2. Find the closure of each of the sets A and B in the previous problem. 

3. Let $X$ be a metric space, and let $x \in X$. Show that the set $B[x,\delta] = \{y \in X :
d(x,y) \le \delta\}$ is closed in $X$. The set $B[x,\delta]$ is called the closed ball of radius
$\delta$ centered at $x$. Give an example to show that the closure of the open ball
$B(x,\delta)$ is not necessarily the closed ball $B[x,\delta]$.

4. Show that if $X$ is a normed linear space, then the closure of the open ball
$B(x,\delta)$ is the closed ball $B[x,\delta]$.

5. Let $H = \{x \in l^2 : |x_n| \le 1/n\}$. Prove that $H$ is closed. This set is known as
the Hilbert cube.

6. Prove that a subset $A$ of a metric space $X$ is bounded if and only if it is
contained in a ball.

7. Let $(X,d)$ be a metric space, let $A \subseteq X$, and let $x,y \in X$. Prove that
$\operatorname{dist}(x,A) \le d(x,y) + \operatorname{dist}(y,A)$.

8. Let $X$, $A$, and $x$ be as in the previous exercise. Prove that $\operatorname{dist}(x,A) =
\operatorname{dist}(x,\overline{A})$.

9. Let $(X,d)$ be a metric space, and let $A$ and $B$ be nonempty subsets of $X$.
Prove that
(a) $\operatorname{dist}(A,B) = \inf\{\operatorname{dist}(a,B) : a \in A\} = \inf\{\operatorname{dist}(b,A) : b \in B\}$,

(b) $\operatorname{dist}(A,B) = \operatorname{dist}(\overline{A},\overline{B})$, and
(c) there are disjoint closed subsets $E$ and $F$ (of $\mathbb{R}$) such that $\operatorname{dist}(E,F) = 0$.


10. Let $(X,d)$ be a metric space, and let $A$ and $B$ be subsets of $X$. Prove that

(a) $\operatorname{int}(A \cap B) = \operatorname{int}(A) \cap \operatorname{int}(B)$, and

(b) $\operatorname{int}(A \cup B) \supseteq \operatorname{int}(A) \cup \operatorname{int}(B)$, giving an example to show that the
containment may be proper.

11. Let $A$ and $B$ be as in the previous exercise. Prove that

(a) $\overline{A \cup B} = \overline{A} \cup \overline{B}$, and

(b) $\overline{A \cap B} \subseteq \overline{A} \cap \overline{B}$, giving an example to show that the containment may be
proper.

12. Show that if a sequence $(x_n)$ in a metric space $X$ converges to $x$, then
$\{x_n : n \in \mathbb{N}\} \cup \{x\}$ is closed in $X$.

13. Cantor-like sets. Let $0 < \varepsilon < 1$. From the unit interval $[0,1]$, remove the
open subinterval of length $\varepsilon/3$ centered at $1/2$, leaving the two closed
intervals $I_{1,1}$ and $I_{1,2}$. Then repeat the geometric construction of the Cantor
set, except require that the removed open interval from $I_{n,j}$ be centered at
the midpoint of $I_{n,j}$ and have length $\varepsilon/3^{n+1}$. The resulting set, $C_\varepsilon$, is known
as a Cantor-like set. Prove that $C_\varepsilon$ is closed, perfect, and nowhere dense.

14. Complete the proof of theorem 4.2.18. Hint: Modify the proof of theorem
2.1.15.

15. Prove the converse of lemma 4.2.16.

16. We now take a more careful look at the ternary representation of numbers
in $[0,1]$. Specifically, we address the issue of the nonuniqueness of the
representation of a finite sum $x = \sum_{i=1}^{n} \frac{a_i}{3^i}$, where $a_n \ne 0$. If $a_n = 2$, we use
the finite sum to represent $x$. If $a_n = 1$, we use the series
$x = \sum_{i=1}^{n-1} \frac{a_i}{3^i} + \sum_{i=n+1}^{\infty} \frac{2}{3^i}$ and not the finite sum to represent $x$.
Prove the converse of proposition 4.2.17. Thus the Cantor set consists
of exactly the points in $[0,1]$ that have a ternary representation of the
form $x = \sum_{i=1}^{\infty} \frac{a_i}{3^i}$, $a_i \in \{0,2\}$. Hint: Prove that if $x = \sum_{i=1}^{\infty} \frac{a_i}{3^i}$ and any of the
integers $a_i = 1$, then $x \notin C$.

17. Prove that a number $x \in C$ is a right endpoint of an interval $I_{n,j}$ if and only
if the ternary representation of $x$ contains a finite number of zeros.

18. Prove that the interior of the standard $n$-simplex $T_n$ consists of all the points
in $T_n$ with positive barycentric coordinates. Hence describe the boundary
of $T_n$.

4.3 Continuity and Equivalent Metrics 


Continuity, from the intuitive point of view, is about the gradual rather than the
abrupt change of function values. In its simplest form, the graph of a continuous,
real-valued function of a single real variable must be connected. Most functions in
mathematics are too complicated for such a visual characterization of continuity,
and a more rigorous and robust definition is needed. The $\varepsilon$-$\delta$ definition of
continuity revolutionized calculus, and hence mathematics, in the early nineteenth
century. It is based on the idea that the fluctuations of a continuous function
can be controlled in a sufficiently small neighborhood of a point of continuity. Our
definition of local continuity in the metric setting is an immediate generalization
of the $\varepsilon$-$\delta$ definition. We then define the global continuity of a function on a metric
space, an important concept seldom treated in undergraduate textbooks. You will
see that continuity does not depend on the specific metric we use to measure
proximity, but rather on the collection of open sets the metric induces. This leads
us to the notion of equivalent metrics and, more generally, homeomorphisms.


Definition. Let $(X,d)$ and $(Y,\rho)$ be metric spaces. A function $f : X \to Y$ is said to
be continuous at a point $x_0 \in X$ if, for every $\varepsilon > 0$, there exists $\delta > 0$ such that
$\rho(f(x), f(x_0)) < \varepsilon$ whenever $d(x,x_0) < \delta$.


The following theorem is an obvious restatement of the definition.


Theorem 4.3.1. Let $(X,d)$ and $(Y,\rho)$ be metric spaces. A function $f : X \to Y$ is
continuous at $x_0$ if and only if the inverse image of an open ball in $Y$ centered
at $f(x_0)$ contains an open ball in $X$ centered at $x_0$.


Theorem 4.3.2. Let $f : (X,d) \to (Y,\rho)$. Then $f$ is continuous at $x_0$ if and only if, for
every sequence $(x_n)$ in $X$ with $\lim_n x_n = x_0$, $\lim_n f(x_n) = f(x_0)$.


Proof. Suppose $f$ is continuous at $x_0$, and let $\lim_n x_n = x_0$. Given $\varepsilon > 0$, there
exists $\delta > 0$ such that $\rho(f(x), f(x_0)) < \varepsilon$ whenever $d(x,x_0) < \delta$. Now there exists
a natural number $N$ such that, for $n > N$, $d(x_n,x_0) < \delta$. Thus, for $n > N$,
$\rho(f(x_n), f(x_0)) < \varepsilon$, and $\lim_n f(x_n) = f(x_0)$. Conversely, if $f$ is not continuous at
$x_0$, there exists $\varepsilon > 0$ such that $f^{-1}(B(f(x_0),\varepsilon))$ contains no open ball centered
at $x_0$, and hence $B(x_0,1/n) - f^{-1}(B(f(x_0),\varepsilon)) \ne \emptyset$ for every $n \in \mathbb{N}$. Pick a point
$x_n \in B(x_0,1/n) - f^{-1}(B(f(x_0),\varepsilon))$. Clearly, $\lim_n x_n = x_0$, but $\lim_n f(x_n) \ne f(x_0)$
because $\rho(f(x_n), f(x_0)) \ge \varepsilon$ for all $n$. ∎


Theorem 4.3.2 provides an extremely useful criterion for proving that a given 
function is continuous. It is called the sequential characterization of continuity. 
See examples 1 and 2 below. 


Definition. A function $f$ from a metric space $(X,d)$ to a metric space $(Y,\rho)$ is
continuous on $X$ if it is continuous at each point $x \in X$.

Theorem 4.3.3. For a function $f$ from a metric space $(X,d)$ to a metric space $(Y,\rho)$,
the following are equivalent.


(a) $f$ is continuous on $X$.
(b) The inverse image of an open subset of $Y$ is an open subset of $X$.
(c) The inverse image of a closed subset of $Y$ is a closed subset of $X$.


Proof. (a) implies (b). Let $V$ be an open subset of $Y$, and let $x_0 \in f^{-1}(V)$. Since $V$ is
open, there exists $\varepsilon > 0$ such that $B(f(x_0),\varepsilon) \subseteq V$. Since $f$ is continuous at $x_0$, there
exists $\delta > 0$ such that $f(B(x_0,\delta)) \subseteq B(f(x_0),\varepsilon)$. Thus $f^{-1}(V) \supseteq f^{-1}(B(f(x_0),\varepsilon)) \supseteq
B(x_0,\delta)$. This proves that $f^{-1}(V)$ is open in $X$.

(b) implies (c). Let $F$ be a closed subset of $Y$. Then $Y - F$ is open in $Y$. By
assumption, $f^{-1}(Y - F)$ is open in $X$. But $f^{-1}(Y - F) = X - f^{-1}(F)$; hence $f^{-1}(F)$
is closed in $X$.

(c) implies (a). Let $x_0 \in X$ and let $V = B(f(x_0),\varepsilon)$; $Y - V$ is closed in $Y$, so, by
assumption, $f^{-1}(Y - V) = X - f^{-1}(V)$ is closed in $X$, and hence $f^{-1}(V)$ is open.
Because $x_0 \in f^{-1}(V)$, there exists $\delta > 0$ such that $B(x_0,\delta) \subseteq f^{-1}(V)$. By theorem 4.3.1,
$f$ is continuous at $x_0$. ∎


Example 1 (the continuity of norms). Let $(x_n)$ be a convergent sequence in a
normed linear space, and suppose that $\lim_n x_n = x$. Then $\lim_n \|x_n\| = \|x\|$. This
follows immediately from the fact that $\big| \|x_n\| - \|x\| \big| \le \|x_n - x\|$. ◆


Example 2 (the continuity of inner products). Let $(x_n)$ and $(y_n)$ be convergent
sequences in an inner product space with limits $x$ and $y$, respectively. Then
$\lim_n \langle x_n, y_n \rangle = \langle x, y \rangle$. First recall that convergent sequences are bounded. Thus
there is a constant $M$ such that $\|y_n\| \le M$ for all $n$. Now

$$
\begin{aligned}
|\langle x_n, y_n \rangle - \langle x, y \rangle| &= |\langle x_n - x, y_n \rangle + \langle x, y_n - y \rangle| \\
&\le |\langle x_n - x, y_n \rangle| + |\langle x, y_n - y \rangle| \\
&\le \|x_n - x\| \|y_n\| + \|x\| \|y_n - y\| \\
&\le M \|x_n - x\| + \|x\| \|y_n - y\| \to 0 \text{ as } n \to \infty.
\end{aligned}
$$ ◆


Definition. Let $d_1$ and $d_2$ be metrics on the same underlying set $X$. We say that $d_1$
is weaker (or coarser) than $d_2$ if every $d_1$-open subset of $X$ is $d_2$-open. In this
case, we also say that $d_2$ is stronger or finer than $d_1$.


Example 3. Let $(X,d_1)$ be any metric space, and let $d_2$ be the discrete metric on $X$.
Clearly, $d_1$ is weaker than $d_2$. We will give more interesting examples later. ◆

Theorem 4.3.4. A metric $d_1$ is weaker than another metric $d_2$ if and only if the
identity function $I_X : (X,d_2) \to (X,d_1)$ is continuous.


Proof. If $I_X : (X,d_2) \to (X,d_1)$ is continuous and $V$ is $d_1$-open, then $I_X^{-1}(V)$ is open
in $d_2$. But $I_X^{-1}(V) = V$, so $d_1$ is weaker than $d_2$. The converse is proved by reversing
the above reasoning. ∎


Now we discuss concrete criteria that guarantee that a metric $d_1$ is weaker than $d_2$.
Since every $d_1$-open set is the union of $d_1$-open balls, it suffices to show that every
$d_1$-open ball $B_{d_1}(x,\delta)$ is $d_2$-open. Since every $y \in B_{d_1}(x,\delta)$ is the center of a ball
$B_{d_1}(y,\delta') \subseteq B_{d_1}(x,\delta)$, it is further sufficient to show that every open ball $B_{d_1}(y,\delta')$
contains a $d_2$-open ball $B_{d_2}(y,\varepsilon)$ for some $\varepsilon > 0$. We apply the above strategy to
prove the following theorem.


Theorem 4.3.5. If there exists a real number $\alpha > 0$ such that $d_1(x,y) \le \alpha d_2(x,y)$ for
all $x,y \in X$, then $d_1$ is weaker than $d_2$.


Proof. Consider a $d_1$-open ball $B_{d_1}(x,\delta)$, and let $\varepsilon = \delta/\alpha$. It follows that
$B_{d_2}(x,\varepsilon) \subseteq B_{d_1}(x,\delta)$, because if $y \in B_{d_2}(x,\varepsilon)$, then $d_1(x,y) \le \alpha d_2(x,y) < \alpha\varepsilon = \delta$. ∎


Now we look at more significant examples of the concepts we developed. 


Example 4. It is clear that $l^1 \subseteq l^\infty$, since the terms of an absolutely convergent series
form a bounded sequence. Thus the space $X = l^1$ has two metrics: the metric induced by the
1-norm $\|\cdot\|_1$ and that induced by the infinity norm $\|\cdot\|_\infty$. Since, for $x \in l^1$,
$\|x\|_\infty \le \|x\|_1$, we have $d_\infty(x,y) = \|x - y\|_\infty \le \|x - y\|_1 = d_1(x,y)$, and the infinity
metric on $X$ is weaker than the 1-metric on $X$. ◆


Example 5. Consider the space $X = C[0,1]$ under the uniform metric and the
1-metric. The identity function $I_X : (X,\|\cdot\|_\infty) \to (X,\|\cdot\|_1)$ is continuous since,
for $f \in C[0,1]$, $\|f\|_1 \le \|f\|_\infty$. By theorem 4.3.4, the 1-metric is weaker than the
uniform metric. However, the identity function $I_X : (X,\|\cdot\|_1) \to (X,\|\cdot\|_\infty)$ is not
continuous. To see this, consider the sequence (see section 3.6)

$$
f_n(x) =
\begin{cases}
2n^3 x & \text{if } 0 \le x \le \frac{1}{2n^2}, \\
-2n^3 \left( x - \frac{1}{n^2} \right) & \text{if } \frac{1}{2n^2} \le x \le \frac{1}{n^2}, \\
0 & \text{if } \frac{1}{n^2} \le x \le 1.
\end{cases}
$$

Here $\|f_n\|_1 = \frac{1}{2n}$; hence $f_n \to 0$ in the 1-norm, while $f_n$ does not converge in the
uniform norm since $\|f_n\|_\infty = n$. ◆
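The two norms in example 5 can be computed exactly. The sketch below is an illustration under a stated assumption: it takes $f_n$ to be the tent function that rises linearly from 0 at $x = 0$ to the value $n$ at $x = 1/(2n^2)$, falls linearly back to 0 at $x = 1/n^2$, and vanishes on the rest of $[0,1]$. The names `f` and `norms` are introduced here for the illustration only. Because $f_n$ is piecewise linear, the trapezoid rule on its breakpoints gives the integral exactly, and the maximum is attained at a breakpoint.

```python
from fractions import Fraction


def f(n, x):
    """Tent function: height n at x = 1/(2n^2), supported on [0, 1/n^2]."""
    peak = Fraction(1, 2 * n * n)
    end = Fraction(1, n * n)
    if 0 <= x <= peak:
        return 2 * n ** 3 * x
    if peak < x <= end:
        return -2 * n ** 3 * (x - end)
    return Fraction(0)


def norms(n):
    """Return (1-norm, uniform norm) of f_n on [0, 1], computed exactly."""
    nodes = [Fraction(0), Fraction(1, 2 * n * n), Fraction(1, n * n), Fraction(1)]
    values = [f(n, x) for x in nodes]
    # Trapezoid rule on the breakpoints is exact for a piecewise linear function.
    one_norm = sum(
        (b - a) * (fa + fb) / 2
        for a, b, fa, fb in zip(nodes, nodes[1:], values, values[1:])
    )
    # A nonnegative piecewise linear function attains its maximum at a breakpoint.
    sup_norm = max(values)
    return one_norm, sup_norm


if __name__ == "__main__":
    for n in (1, 5, 50):
        print(n, *norms(n))
```

The first value tends to 0 while the second grows without bound, which is the failure of continuity of the identity from the 1-norm to the uniform norm.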




Definition. Two metrics $d_1$ and $d_2$ on $X$ are equivalent if they generate the same
collection of open sets. Thus $d_1$ and $d_2$ are equivalent if $d_1$ is weaker than $d_2$ and
$d_2$ is weaker than $d_1$.


Definition. A bijection $f$ from a metric space $(X,d)$ to a metric space $(Y,\rho)$ is
bicontinuous if $f$ and $f^{-1}$ are continuous.


By theorem 4.3.4, the following result is immediate.


Theorem 4.3.6. Two metrics $d_1$ and $d_2$ on a set $X$ are equivalent if and only if the
identity function $I_X : (X,d_1) \to (X,d_2)$ is bicontinuous. ∎


Theorem 4.3.5 directly implies the following theorem.


Theorem 4.3.7. If there exist positive constants $\alpha$ and $\beta$ such that $\beta d_2(x,y) \le
d_1(x,y) \le \alpha d_2(x,y)$ for every $x,y \in X$, then $d_1$ and $d_2$ are equivalent.


The following theorem gives a sequential characterization of the equivalence of 
two metrics. 


Theorem 4.3.8. A necessary and sufficient condition for two metrics $d_1$ and $d_2$ on
$X$ to be equivalent is that a sequence $(x_n)$ converges to $x$ in $d_1$ if and only if it
converges to $x$ in $d_2$.


Proof. Suppose $d_1$ and $d_2$ are equivalent, and let $\lim_n x_n = x$ in $d_1$. By theorem 4.3.6,
$I_X : (X,d_1) \to (X,d_2)$ is continuous; hence, by the sequential characterization of
continuity, $I_X(x_n) = x_n$ converges to $x$ in $d_2$. We leave the rest of the proof to the
reader. ∎


Example 6. Let $X = \mathbb{R}^n$. The metrics induced by the 1-norm, the 2-norm, and
the $\infty$-norm are all equivalent. To see this, we use theorem 4.3.7. The reader
should work out the details. A partial list of the inequalities needed includes
$\|x\|_1 \le n\|x\|_\infty$ and $\|x\|_1 \le \sqrt{n}\|x\|_2$.
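The norm comparisons behind example 6 can be spot-checked numerically. The sketch below is an illustration only: the helper names are assumptions introduced here, and the chain it tests, $\|x\|_\infty \le \|x\|_2 \le \|x\|_1 \le \sqrt{n}\|x\|_2 \le n\|x\|_\infty$, is one standard way to organize the inequalities (it contains the two listed above). A small tolerance absorbs floating-point rounding; passing the check on sample vectors is evidence, not a proof.

```python
import math
import random


def norm_1(x):
    return sum(abs(t) for t in x)


def norm_2(x):
    return math.sqrt(sum(t * t for t in x))


def norm_inf(x):
    return max(abs(t) for t in x)


def chain_holds(x, tol=1e-9):
    """Check inf-norm <= 2-norm <= 1-norm <= sqrt(n)*2-norm <= n*inf-norm."""
    n = len(x)
    return (
        norm_inf(x) <= norm_2(x) + tol
        and norm_2(x) <= norm_1(x) + tol
        and norm_1(x) <= math.sqrt(n) * norm_2(x) + tol
        and math.sqrt(n) * norm_2(x) <= n * norm_inf(x) + tol
    )


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [[rng.uniform(-5, 5) for _ in range(7)] for _ in range(1000)]
    print(all(chain_holds(x) for x in samples))
```

Each inequality in the chain supplies one of the constants $\alpha$, $\beta$ required by theorem 4.3.7.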


Example 7. Let $(X,d)$ be a metric space. Then the metric $\bar{d}(x,y) = \min\{1, d(x,y)\}$
is equivalent to $d$. It is a simple exercise to show that $\bar{d}$ is a metric. The fact that
the two metrics are equivalent follows from $B_d(x,\varepsilon) \subseteq B_{\bar{d}}(x,\varepsilon)$ and
$B_{\bar{d}}(x,\delta) \subseteq B_d(x,\varepsilon)$, where $\delta = \min\{\varepsilon, 1\}$. ◆


Remarks. 


1. Important properties of metric spaces are often determined by the collection
of open sets and not by the specific metric that generates the open sets. For
example, a function $f$ from a metric space $(X,d)$ to a metric space $(Y,\rho)$ is
continuous if and only if the inverse image of an open subset of $Y$ is open
in $X$. Clearly, if the metric $d$ is replaced with an equivalent metric $d_1$, then
$f$ is continuous with respect to $d_1$. The same is true if $\rho$ is replaced with an
equivalent metric. In many ways, the collection of open sets in a metric space
is almost as intrinsic to the space as the specific metric that generates the
open sets.

2. Not all metric properties are preserved under metric equivalence. Observe
that the metric $\min\{1, d\}$ of example 7 above is a bounded metric because its
values never exceed 1, even though the metric $d$ may be unbounded. For example,
the metric $\min\{1, |x - y|\}$ on $\mathbb{R}$ is equivalent to the usual metric on $\mathbb{R}$.
In particular, boundedness is not preserved under metric equivalence.


We include the following as another example of a bounded metric that is equivalent 
to an arbitrary metric. 


Example 8. For an arbitrary metric space $(X,d)$, the metric
$\tilde{d}(x,y) = \frac{d(x,y)}{1 + d(x,y)}$ is equivalent to $d$.

We show that $\tilde{d}$ satisfies the triangle inequality and leave the rest of the details
for the reader to verify. The function $f : [0,\infty) \to [0,1)$ defined by $f(t) = \frac{t}{1+t}$ is
increasing. Thus if $0 \le a \le b$, then $\frac{a}{1+a} \le \frac{b}{1+b}$. Replacing $a$ with $d(x,z)$, and $b$ with
$d(x,y) + d(y,z)$, yields

$$
\begin{aligned}
\tilde{d}(x,z) = \frac{d(x,z)}{1 + d(x,z)} &\le \frac{d(x,y) + d(y,z)}{1 + d(x,y) + d(y,z)} \\
&= \frac{d(x,y)}{1 + d(x,y) + d(y,z)} + \frac{d(y,z)}{1 + d(x,y) + d(y,z)} \\
&\le \frac{d(x,y)}{1 + d(x,y)} + \frac{d(y,z)}{1 + d(y,z)} = \tilde{d}(x,y) + \tilde{d}(y,z).
\end{aligned}
$$ ◆
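The triangle inequality for the metric $d/(1+d)$ can be checked exhaustively on a finite sample. The sketch below is illustrative and makes assumptions that are not in the text: it takes $d$ to be the usual metric on $\mathbb{R}$, restricts attention to a small grid of rational points, and uses the hypothetical names `d_tilde` and `triangle_holds`. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import product


def d(x, y):
    """The usual metric on the real line."""
    return abs(x - y)


def d_tilde(x, y):
    """The bounded metric d / (1 + d) built from d."""
    return d(x, y) / (1 + d(x, y))


def triangle_holds(points):
    """Check d_tilde(x, z) <= d_tilde(x, y) + d_tilde(y, z) for all triples."""
    return all(
        d_tilde(x, z) <= d_tilde(x, y) + d_tilde(y, z)
        for x, y, z in product(points, repeat=3)
    )


if __name__ == "__main__":
    grid = [Fraction(k, 3) for k in range(-9, 10)]
    print(triangle_holds(grid))
    print(max(d_tilde(x, y) for x in grid for y in grid))
```

Every value of `d_tilde` is strictly less than 1, which reflects the boundedness of the new metric even though the usual metric on $\mathbb{R}$ is unbounded.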


Definition. Let $(X,d)$ and $(Y,\rho)$ be metric spaces. A function $\varphi : X \to Y$ is said
to be an isometry if, for every $x,y \in X$, $\rho(\varphi(x), \varphi(y)) = d(x,y)$. Notice that an
isometry is always injective. We say that the metric spaces $(X,d)$ and $(Y,\rho)$ are
isometric if there is a bijective isometry $\varphi : (X,d) \to (Y,\rho)$.


Example 9 (linear isometries on $\mathbb{R}^n$). Let $T$ be a linear isometry on $\mathbb{R}^n$. Then
there exists an orthogonal matrix $P$ such that $T(x) = Px$ for every $x \in \mathbb{R}^n$.
Observe that the converse was established in the exercises in section 3.7.

Let $P$ be the standard matrix of $T$. We prove that $P$ is orthogonal. The
assumption is that, for $x,y \in \mathbb{R}^n$, $\|Px - Py\| = \|x - y\|$. In particular, taking
$y = 0$, we have $\|Px\| = \|x\|$. First we claim that, for $x,y \in \mathbb{R}^n$, $\langle Px, Py\rangle = \langle x,y\rangle$.
This will conclude the proof because we then have

\[
\langle P^T P x - x, y\rangle = \langle P^T P x, y\rangle - \langle x, y\rangle = \langle Px, Py\rangle - \langle x, y\rangle = 0.
\]

Choosing $y = P^T P x - x$, we obtain $\langle P^T P x - x, P^T P x - x\rangle = 0$, or
$(P^T P - I_n)x = 0$. Since $x$ is arbitrary, $P^T P - I_n = 0$.

We now prove the claim. The assumption that $\|Px - Py\|^2 = \|x - y\|^2$ yields
$\langle Px - Py, Px - Py\rangle = \langle x - y, x - y\rangle$. Expanding the bilinear forms on the two
sides of the last identity and using $\|Px\| = \|x\|$ and $\|Py\| = \|y\|$ yields
$\langle Px, Py\rangle = \langle x,y\rangle$, as claimed. ♦
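The relationship in example 9 can be illustrated numerically. The Python sketch below takes a rotation of the plane as a sample orthogonal matrix and checks both that it preserves distances and that $P^T P$ is the identity up to rounding. The angle, the sample vectors, and the helper names `matvec`, `transpose_times`, and `norm` are assumptions made for this illustration only.

```python
import math


def matvec(P, x):
    """Multiply a square matrix (given as a list of rows) by a vector."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]


def transpose_times(P):
    """Return the product P^T P as a list of rows."""
    n = len(P)
    return [[sum(P[k][i] * P[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(xi * xi for xi in x))


theta = 0.7  # arbitrary angle, chosen only for illustration
P = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

x, y = [3.0, -1.0], [0.5, 2.0]
diff = [a - b for a, b in zip(x, y)]
image_diff = [a - b for a, b in zip(matvec(P, x), matvec(P, y))]

gram = transpose_times(P)  # expected to be close to the 2 x 2 identity
print(abs(norm(image_diff) - norm(diff)) < 1e-12, gram)
```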


Isometric spaces are virtually identical except for the nature of the elements of the
spaces $X$ and $Y$ and the definition of the metrics $d$ and $\rho$. An isometry preserves
all the metric properties of the space, including boundedness, which, as we saw, is 
not preserved under the equivalence of metrics. Another metric property that is 
preserved under isometries but not under metric equivalence is completeness. See 
section 4.6. 


Homeomorphisms 


The concept of a homeomorphism is of central importance in topology. In the met- 
ric setting, isometry, although quite useful, is too stringent and does not preclude 
homeomorphisms from being useful. One can loosely think of a homeomorphism 
as a relaxation of the concept of isometry and an extension of the notion of metric 
equivalence. 


Definition. Two metric spaces $(X,d)$ and $(Y,\rho)$ are homeomorphic if there exists
a bicontinuous bijection $\varphi$ from $X$ to $Y$. The function $\varphi$ is called a
homeomorphism from $X$ to $Y$.


Example 10. The open interval (—1, 1) is homeomorphic to R (both sets have the 
usual metric). The function f(t) = _ maps (—1, 1) bicontinuously onto R. 


Example 11. The closed upper half plane $H = \mathbb{R} \times [0,\infty)$ is homeomorphic to
the half-open strip $A = \mathbb{R} \times [0,1)$. To see this, define $\varphi : H \to A$ by
$\varphi(x,y) = \left(x, \frac{y}{1+y}\right)$. It is a rather routine matter to verify that $\varphi$ is a bijection
and that its inverse is $\varphi^{-1}(x,t) = \left(x, \frac{t}{1-t}\right)$. ♦


Example 12 (the stereographic projection). Let $S^1 = \{(\xi_1,\xi_2) \in \mathbb{R}^2 : \xi_1^2 + \xi_2^2 - \xi_2 = 0\}$
be a circle of diameter 1 and centered at the point $(0,1/2)$, and let $N = (0,1)$ be the
top point on the circle. Define the punctured circle to be the circle
with the top point removed: $S^1_* = S^1 - \{N\}$. We give $S^1_*$ the Euclidean metric in
the plane. Define a bijection $P : S^1_* \to \mathbb{R}$ as follows: for a point $\xi = (\xi_1,\xi_2) \in S^1_*$,
$P(\xi)$ is the horizontal intercept of the line that contains the points $N$ and $\xi$, as
shown in figure 4.1.


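Figure 4.1 The stereographic projection (labeled points: $N = (0,1)$, the origin $(0,0)$,
a point $\xi = P^{-1}(x)$ on the circle, and its image $x = P(\xi)$ on the horizontal axis)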


The mapping $P$ is known as the stereographic projection of the punctured circle
onto the real line. It is geometrically clear that $P$ is a bijection and that it is
bicontinuous: the inverse image of a bounded open interval in $\mathbb{R}$ is an open arc
on $S^1_*$, and conversely.


Explicit formulas exist for $P$ and $P^{-1}$. It is easier to derive the formula for $P^{-1}$
than to compute that for $P$. For a fixed $x \in \mathbb{R}$, the parametric equations of the
line containing the points $N$ and $(x,0)$ are $\xi_1 = xt$, $\xi_2 = 1 - t$, and $-\infty < t < \infty$.
Finding the intersection point $\xi$ of the line and the circle yields the formula
for $P^{-1}$:

\[
P^{-1}(x) = (\xi_1,\xi_2) = \left(\frac{x}{1+x^2}, \frac{x^2}{1+x^2}\right).
\]

Inverting the above formulas, one obtains the following formula for the
stereographic projection:

\[
x = P(\xi) = \frac{\xi_1}{1-\xi_2}.
\]

We define the chordal metric $\chi(x,y)$ on $\mathbb{R}$ as follows: for two points $x,y \in \mathbb{R}$,
$\chi(x,y)$ is the length of the chord of the circle that joins the points $P^{-1}(x)$ and
$P^{-1}(y)$, hence the name chordal metric. Note that $\chi$ is the metric on $\mathbb{R}$ that
makes the stereographic projection an isometry. Given the above formula for
$P^{-1}$, a direct calculation of the Euclidean distance between $P^{-1}(x)$ and $P^{-1}(y)$
yields

\[
\chi(x,y) = \frac{|x-y|}{\sqrt{1+x^2}\,\sqrt{1+y^2}}.
\]

Because the stereographic projection is a homeomorphism, the chordal metric
$\chi$ is equivalent to the usual metric on $\mathbb{R}$. We will see in section 4.6 that the
chordal metric is not complete. This illustrates the fact that completeness is not
preserved under homeomorphisms. The reader will recall that boundedness
is not preserved under metric equivalence. Saying that two metrics $d_1$ and
$d_2$ on a space $X$ are equivalent is exactly the same as saying that the identity
mapping $I_X : (X,d_1) \to (X,d_2)$ is a homeomorphism. The properties of a space
that are preserved under homeomorphisms are called topological properties
of the space. Compactness is the prime example of a topological property;
see theorem 4.7.4. The fact that some metric properties, such as boundedness
and completeness, are not preserved under homeomorphisms is rather
inconvenient, but it does not diminish the usefulness of such properties. ♦
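The formulas of example 12 lend themselves to a quick numerical check. The Python sketch below, offered only as an illustration, codes $P^{-1}$, $P$, and the closed form of the chordal metric, and verifies on a few sample pairs that $P^{-1}(x)$ lies on the circle, that $P(P^{-1}(x)) = x$, and that the closed form agrees with the Euclidean chord length. The function names and the sample points are assumptions.

```python
import math


def p_inverse(x):
    """Map a real number x to the corresponding point of the punctured circle."""
    return (x / (1.0 + x * x), x * x / (1.0 + x * x))


def p(xi):
    """Stereographic projection of a point xi = (xi1, xi2) with xi2 != 1."""
    return xi[0] / (1.0 - xi[1])


def chordal(x, y):
    """Closed-form chordal metric on the real line."""
    return abs(x - y) / (math.sqrt(1.0 + x * x) * math.sqrt(1.0 + y * y))


def chord_length(x, y):
    """Euclidean distance between the two preimages on the circle."""
    return math.dist(p_inverse(x), p_inverse(y))


for x, y in [(0.0, 1.0), (-2.5, 4.0), (10.0, 1000.0)]:
    xi = p_inverse(x)
    on_circle = abs(xi[0] ** 2 + xi[1] ** 2 - xi[1]) < 1e-12
    round_trip = abs(p(xi) - x) < 1e-9
    same = abs(chordal(x, y) - chord_length(x, y)) < 1e-12
    print(on_circle, round_trip, same)
```

Note that `chordal(x, y)` never exceeds 1, the diameter of the circle, which is another way to see that the chordal metric is bounded.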


Example 13. Stereographic projections can be defined in all dimensions. Let
$S^n = \{\xi = (\xi_1,\xi_2,\dots,\xi_{n+1}) \in \mathbb{R}^{n+1} : \sum_{i=1}^{n+1} \xi_i^2 - \xi_{n+1} = 0\}$ be the sphere in $\mathbb{R}^{n+1}$ of
diameter 1 and center $(0,0,\dots,0,1/2) \in \mathbb{R}^{n+1}$, and let $N = (0,0,\dots,0,1) \in S^n$.
Define the punctured sphere $S^n_* = S^n - \{N\}$. The stereographic projection $P$ of
$S^n_*$ onto $\mathbb{R}^n$ maps a point $\xi = (\xi_1,\xi_2,\dots,\xi_{n+1})$ on the punctured sphere to the
intersection of the hyperplane $\xi_{n+1} = 0$ and the line that contains the points
$\xi$ and $N$. As in the one-dimensional case, it is easier to compute the formula
for $P^{-1}$ than that for $P$. The calculations needed for computing the formulas
for $P$ and $P^{-1}$ are left as an exercise (see problem 18). The continuity of all the
component functions shows that $P$ is a homeomorphism. See theorem 4.4.6.
One can also define the chordal metric on $\mathbb{R}^n$ by $\chi(x,y) = \|P^{-1}(x) - P^{-1}(y)\|_2$.
See problem 19 in the section exercises.


One important special case is when n= 2. This is relevant to the one-point 
compactification of the plane; see section 5.10. 


Exercises 


1. Let $\mathbb{K}$ denote the real or complex field with the usual metric. Prove that if $f$
and $g$ are continuous functions from a metric space $(X,d)$ to $\mathbb{K}$, then so are
the functions $f + g$ and $fg$. If, in addition, $g(x) \ne 0$ for all $x \in X$, then $f/g$ is
continuous.

2. Let $f$ be a continuous function from a metric space $X$ to a metric space $Y$,
and let $g$ be a continuous function from $Y$ to a metric space $Z$. Prove that
the composition $g \circ f : X \to Z$ is continuous.

3. Let $f$ be a continuous function from a metric space $X$ to a metric space $Y$,
and let $A \subseteq X$. Prove that the restriction of $f$ to $A$ is continuous when $A$ is
given the restricted metric.

4. Fix an element $a$ of a metric space $X$, and define a function $f : X \to \mathbb{R}$ by
$f(x) = d(x,a)$. Prove that $f$ is continuous.

5. Let $A$ be a fixed subset of a metric space $X$, and define a function $f : X \to \mathbb{R}$
by $f(x) = \mathrm{dist}(x,A)$. Prove that $f$ is continuous.

6. Let $E$ be a closed subset of a metric space $X$, and let $a \in X - E$. Prove
that there exists a continuous function $f : X \to \mathbb{R}$ such that $f(a) = 0$, $f(E) = 1$.
Hint: Consider the function $f(x) = \dfrac{d(x,a)}{d(x,a) + \mathrm{dist}(x,E)}$. Show how this result
provides an alternative proof of theorem 4.2.12.

7. Let $E$ and $F$ be disjoint closed subsets of a metric space $X$. Prove that there
exists a continuous function $f : X \to \mathbb{R}$ such that $f(E) = 0$, $f(F) = 1$. Show
how this result provides an alternative proof of theorem 4.2.13.

8. Let $(X,d)$ be a metric space, and let $E$ and $F$ be disjoint closed subsets
of $X$. Set $U = \{x \in X : \mathrm{dist}(x,E) < \mathrm{dist}(x,F)\}$, and $V = \{x \in X : \mathrm{dist}(x,F) < \mathrm{dist}(x,E)\}$.
Show that $U$ and $V$ are open sets that separate $E$ and $F$.

9. Let $f$ and $g$ be continuous functions from a metric space $X$ to a metric space
$Y$. Prove that $\{x \in X : f(x) \ne g(x)\}$ is an open subset of $X$.

10. Let $f$ and $g$ be continuous functions from a metric space $X$ to a metric space
$Y$, and let $A$ be a subset of $X$ such that $f(x) = g(x)$ for every $x \in A$. Prove that
$f(x) = g(x)$ for all $x \in \bar A$.

11. Prove that the metric $\tilde d$ in example 8 is equivalent to $d$.

12. Show that the converse of theorem 4.3.5 is false by finding a metric $d_1$ that
is weaker than another metric $d_2$ but where there exists no constant $a > 0$ such
that $d_1(x,y) \le a\,d_2(x,y)$ for all $x,y \in X$.

13. Fix an element $x_0$ of a normed linear space $X$, and define a function
$\varphi : X \to X$ by $\varphi(x) = x + x_0$. Show that $\varphi$ is an isometry.

14. Define a function $\varphi : X \to X$ on a normed linear space $X$ by $\varphi(x) = -x$.
Show that $\varphi$ is an isometry.

15. Show that the open unit disk $U = \{(x,y) \in \mathbb{R}^2 : x^2 + y^2 < 1\}$ is homeomorphic
to the plane $\mathbb{R}^2$. Hence show that the punctured disk $U - \{(0,0)\}$
is homeomorphic to the punctured plane $\mathbb{R}^2 - \{(0,0)\}$. The same results
extend to $\mathbb{R}^n$.

16. Let $f,g : \mathbb{R} \to \mathbb{R}$ be continuous functions such that $f(x) < g(x)$ for all $x \in \mathbb{R}$.
Show that the region between the graphs, $\{(x,y) : x \in \mathbb{R}, f(x) < y < g(x)\}$, is
homeomorphic to the open strip $\mathbb{R} \times (0,1)$.

17. Let $X$ be a normed linear space. Prove that any two open balls in $X$ are
homeomorphic. The same is true of closed balls.

18. The parametric equations of the line containing the points $(0,0,\dots,1)$ and
$(x_1,x_2,\dots,x_n,0)$ in $\mathbb{R}^{n+1}$ are

\[
\xi = (\xi_1,\dots,\xi_n,\xi_{n+1}) = (tx_1,\dots,tx_n,1-t).
\]

Find the point of intersection of the line and the sphere,

\[
S^n = \Big\{\xi = (\xi_1,\dots,\xi_{n+1}) : \sum_{i=1}^{n+1} \xi_i^2 - \xi_{n+1} = 0\Big\},
\]

to derive the formula for the inverse of the stereographic projection,
$P^{-1}(x_1,\dots,x_n) = (\xi_1,\dots,\xi_{n+1})$, where

\[
\xi_i = \frac{x_i}{1+\|x\|_2^2}, \quad 1 \le i \le n, \qquad \xi_{n+1} = \frac{\|x\|_2^2}{1+\|x\|_2^2}.
\]

Hence, by inverting the above formulas, derive the formula for the stereographic
projection,

\[
P(\xi_1,\dots,\xi_{n+1}) = (x_1,\dots,x_n), \quad \text{where } x_i = \frac{\xi_i}{1-\xi_{n+1}}, \quad 1 \le i \le n.
\]

19. Derive the formula for the chordal metric on $\mathbb{R}^n$,

\[
\chi(x,y) = \frac{\|x-y\|_2}{\sqrt{1+\|x\|_2^2}\,\sqrt{1+\|y\|_2^2}}.
\]


4.4 Product Spaces


The Euclidean plane $\mathbb{R}^2$, as the product of two copies of $\mathbb{R}$, is the simplest example
of a product space. We saw in section 4.3 that the Euclidean metric in the plane,
although the most natural, is equivalent to several other metrics, including the
$\infty$-metric, which, according to the definition below, is the product metric on
$\mathbb{R}^2$. It is only natural to expect that the product of two open intervals should
be an open subset of $\mathbb{R}^2$, and the definition we adopt for the product metric
guarantees that. When we identify the complex field with $\mathbb{R}^2$, the
convergence of a complex sequence $z_n = x_n + iy_n$ is equivalent to the convergence
of its real and imaginary parts in $\mathbb{R}$, and one expects that product metrics in
general should extend this property. Not only does the product metric preserve the
componentwise convergence in the factor spaces, it is characterized by it. You will
see that the product metric is the weakest metric that guarantees componentwise
convergence in the factor spaces. Additionally, we will show that the product
metric admits the continuity of the projections on the factor spaces and, once
again, is characterized by it. We therefore think of the product metric as the most
economical metric that generalizes the properties of Euclidean space in relation
to its factor spaces.


Let $\{(X_i,d_i)\}_{i=1}^n$ be a finite set of metric spaces, and let $X = \prod_{i=1}^n X_i =
\{(x_1,\dots,x_n) : x_i \in X_i\}$ be the Cartesian product of the underlying sets $X_i$.


Definition. The product metric $D$ on $X$ is defined by

\[
D(x,y) = \max_{1 \le i \le n} d_i(x_i,y_i).
\]

Here $x = (x_1,\dots,x_n)$ and $y = (y_1,\dots,y_n)$ are points in $X$. The verification that $D$ is a
metric is straightforward.


Example 1. For $1 \le i \le n$, take $X_i = \mathbb{R}$, and let $d_i$ be the usual metric on $\mathbb{R}$. The
product metric $D$ on $\prod_{i=1}^n X_i = \mathbb{R}^n$ is exactly the $\infty$-metric on $\mathbb{R}^n$. ♦
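The definition of the product metric translates directly into code. The following Python sketch is only an illustration: the factory `product_metric`, the factor metrics `usual` and `discrete`, and the sample points are assumptions and do not come from the text. The first product reproduces the $\infty$-metric on $\mathbb{R}^3$ of example 1, and the second mixes two different factor spaces.

```python
def product_metric(metrics):
    """Return D(x, y) = max_i d_i(x_i, y_i) for a list of factor metrics."""
    def D(x, y):
        return max(d(xi, yi) for d, xi, yi in zip(metrics, x, y))
    return D


def usual(s, t):
    """The usual metric on the real line."""
    return abs(s - t)


def discrete(s, t):
    """The discrete metric, used here as a second, hypothetical factor."""
    return 0.0 if s == t else 1.0


D_inf = product_metric([usual, usual, usual])   # the infinity-metric on R^3
D_mix = product_metric([usual, discrete])       # R times a discrete space

print(D_inf((1.0, 2.0, 3.0), (1.5, 0.0, 3.0)))
print(D_mix((0.25, "a"), (0.5, "b")))
```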


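For $x \in X$ and $\delta > 0$, we denote the $D$-ball in $X$ of radius $\delta$ centered at $x$ by $B_D(x,\delta)$.

Theorem 4.4.1. If $x \in X$ and $\delta > 0$, then

\[
B_D(x,\delta) = B_{d_1}(x_1,\delta) \times \dots \times B_{d_n}(x_n,\delta).
\]

Proof. A point $y = (y_1,\dots,y_n)$ is in $B_D(x,\delta)$ if and only if $\max_{1 \le i \le n} d_i(x_i,y_i) < \delta$, if
and only if $d_i(x_i,y_i) < \delta$ for each $1 \le i \le n$, if and only if $y_i \in B_{d_i}(x_i,\delta)$ for each
$1 \le i \le n$, if and only if $y \in \prod_{i=1}^n B_{d_i}(x_i,\delta)$.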


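Theorem 4.4.2. If $U_i$ is open in $X_i$ for each $1 \le i \le n$, then the set $U = \prod_{i=1}^n U_i$ is
open in $(X,D)$.


Proof. Let $x \in \prod_{i=1}^n U_i$. Then $x_i \in U_i$, and hence there exists $\delta_i > 0$ such that
$B_{d_i}(x_i,\delta_i) \subseteq U_i$. Let $\delta = \min_{1 \le i \le n} \delta_i$. Clearly,
$\prod_{i=1}^n B_{d_i}(x_i,\delta) \subseteq \prod_{i=1}^n B_{d_i}(x_i,\delta_i) \subseteq \prod_{i=1}^n U_i$.
By theorem 4.4.1, $\prod_{i=1}^n B_{d_i}(x_i,\delta) = B_D(x,\delta)$. Thus $x \in B_D(x,\delta) \subseteq U$,
which proves that $U$ is open. ■


Remark. For a fixed $1 \le i \le n$, let $U_i$ be an open subset of $X_i$. As an immediate
consequence of theorem 4.4.2, the set $X_1 \times \dots \times X_{i-1} \times U_i \times X_{i+1} \times \dots \times X_n$ is
open in $X$. It follows that
$X - [X_1 \times \dots \times X_{i-1} \times U_i \times X_{i+1} \times \dots \times X_n] = X_1 \times \dots \times X_{i-1} \times (X_i - U_i) \times X_{i+1} \times \dots \times X_n$
is closed in $X$.


Theorem 4.4.3. If $F_1,\dots,F_n$ are closed subsets of $X_1,\dots,X_n$, respectively, then
$F_1 \times \dots \times F_n$ is closed in $X$.


Proof. Let $U_i = X_i - F_i$. Then
$F = \prod_{i=1}^n F_i = \prod_{i=1}^n (X_i - U_i) = \bigcap_{i=1}^n [X_1 \times \dots \times X_{i-1} \times (X_i - U_i) \times X_{i+1} \times \dots \times X_n]$.
By the above remark, each of the sets
$X_1 \times \dots \times X_{i-1} \times (X_i - U_i) \times X_{i+1} \times \dots \times X_n$ is closed, and hence $F$ is closed. ■


Theorem 4.4.4. Suppose $(X,D) = \prod_{i=1}^n (X_i,d_i)$. Let $(x^{(k)})$ be a sequence in $X$,
and let $x = (x_1,\dots,x_n) \in X$. Write $x^{(k)} = (x_1^{(k)},\dots,x_n^{(k)})$. Then $\lim_k x^{(k)} = x$ in $D$
if and only if $\lim_k x_i^{(k)} = x_i$ in $d_i$ for each $1 \le i \le n$.


Proof. Because $d_i(x_i^{(k)},x_i) \le D(x^{(k)},x)$, $\lim_k D(x^{(k)},x) = 0$ implies that
$\lim_k d_i(x_i^{(k)},x_i) = 0$. Conversely, if $\lim_k x_i^{(k)} = x_i$ in $d_i$ for each $1 \le i \le n$, then
$\lim_k \max_{1 \le i \le n} d_i(x_i^{(k)},x_i) = 0$, and hence $\lim_k x^{(k)} = x$. ■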


Theorem 4.4.4 says that the convergence of a sequence in the product metric D 
is equivalent to the convergence of each of the component sequences (compo- 
nentwise convergence). In fact, componentwise convergence characterizes all the 
metrics on X that are equivalent to the product metric D, as the following theorem 
shows. 


Theorem 4.4.5. Suppose D* is a metric on the product space X where convergence 
in D* is equivalent to componentwise convergence. Then D* is equivalent to D. 


Proof. We use theorem 4.3.8: the metrics $D$ and $D^*$ are equivalent if and only if
a sequence converges in one metric exactly when it converges in the other.
Clearly, this is the case for $D$ and $D^*$, since convergence in either metric is
equivalent to componentwise convergence.


Example 2. To illustrate the importance of the above theorem, note that each of
the following metrics is equivalent to the product metric $D$ on $X$:

\[
D_1(x,y) = \sum_{i=1}^n d_i(x_i,y_i), \qquad
D_2(x,y) = \Big(\sum_{i=1}^n d_i(x_i,y_i)^2\Big)^{1/2}.
\]

It is clear that convergence of a sequence in $(X,D_1)$ or $(X,D_2)$ occurs exactly
when the component sequences converge.


Therefore we can use either of the metrics $D_1$ or $D_2$ or any other metric
equivalent to $D$ as a definition of the product space, since they all generate the
same collection of open sets. It is common to use whatever metric happens
to be convenient in any particular situation. Theorem 4.4.5 also yields an
equivalent metric for the product space if the metrics $d_1,\dots,d_n$ are replaced
with equivalent metrics.
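One concrete way to see why the three metrics share the same convergent sequences is the standard chain of inequalities $D \le D_2 \le D_1 \le nD$. The Python sketch below checks this chain on random points of $\mathbb{R}^4$, taking every factor to be the real line with the usual metric. That choice of factors, the dimension, and the sampling range are assumptions made for the illustration.

```python
import math
import random


def D(x, y):
    """Product (max) metric with the usual metric on each factor."""
    return max(abs(a - b) for a, b in zip(x, y))


def D1(x, y):
    """Sum of the factor distances."""
    return sum(abs(a - b) for a, b in zip(x, y))


def D2(x, y):
    """Square root of the sum of squared factor distances."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


random.seed(1)
n = 4
ok = True
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(n)]
    y = [random.uniform(-5, 5) for _ in range(n)]
    # The small tolerance only absorbs floating-point rounding.
    ok = ok and D(x, y) <= D2(x, y) + 1e-12
    ok = ok and D2(x, y) <= D1(x, y) + 1e-12
    ok = ok and D1(x, y) <= n * D(x, y) + 1e-12
print(ok)
```

Because each metric is squeezed between constant multiples of the others, a sequence of distances tends to zero in one of them exactly when it does in all three.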


The following theorem is very useful in characterizing continuity of vector
functions (functions into a product space). It says that a vector function is continuous
exactly when its component functions are.


Theorem 4.4.6. Let $(Y,\rho)$ be a metric space, let $f : Y \to \prod_{i=1}^n X_i$, and write
$f(y) = (f_1(y),\dots,f_n(y))$. Then $f$ is continuous if and only if each of the component
functions $f_i : Y \to X_i$ is continuous.


Proof. Let $(y^{(k)})$ be a sequence in $Y$, and suppose $\lim_k y^{(k)} = y$. By theorem 4.4.4,
$\lim_k f(y^{(k)}) = f(y)$ if and only if $\lim_k f_i(y^{(k)}) = f_i(y)$ for all $1 \le i \le n$. Since continuity
in metric spaces is equivalent to sequential continuity, the result follows.


Example 3. We used the previous theorem to prove the continuity of the
stereographic projections. See example 12 in section 4.3, and problem 18 in the same
section.


Exercises 


1. Let $\{(X_i,d_i)\}_{i=1}^n$ be a finite set of metric spaces. Prove that $X_1 \times \dots \times X_n$ is
isometric to $X_1 \times (X_2 \times \dots \times X_n)$.

2. Let $\{(X_i,d_i)\}_{i=1}^n$ be a finite set of metric spaces, and let $(X,D)$ be their product.

(a) Prove directly, using the definition of $D$, that the projections $\pi_i : X \to X_i$
are continuous. It follows that $\pi_i^{-1}(U_i)$ is open in $X$ for every open subset
$U_i$ of $X_i$.

(b) It then follows from part (a) that, for open subsets $U_i \subseteq X_i$, $1 \le i \le n$,
$\bigcap_{i=1}^n \pi_i^{-1}(U_i)$ is open. What is $\bigcap_{i=1}^n \pi_i^{-1}(U_i)$?

3. When you have solved exercise 2 above, you will have an alternative proof of
theorem 4.4.2. Do you see it?

4. Consider the metric $d$ on a metric space $X$ as a function on the product
space $X \times X$, endowed with the product metric. Prove that $d : X \times X \to \mathbb{R}$
is continuous.

5. Let $\{(X_i,d_i)\}_{i=1}^n$ be a finite set of metric spaces, and let $X = \prod_{i=1}^n X_i$ be the
Cartesian product of the underlying sets. Prove that the product metric is the
weakest metric relative to which all the projections $\pi_i$ are continuous. More
explicitly stated, show that if $D^*$ is a metric on $X$ and each $\pi_i : (X,D^*) \to
(X_i,d_i)$ is continuous, then the product metric is weaker than $D^*$.


4.5 Separable Spaces 


Although the rigorous definition of the real line was a giant leap in the devel- 
opment of mathematics, it would not be nearly as useful an invention had it not 
been for the fact that it contains the rational numbers as a dense subset. Indeed, 
all practical computations, including machine calculations, are done exclusively 
using rational numbers. The simplicity of rational numbers is enhanced by their 
countability. Thus Q is numerous enough, simple enough, but not too enormous to 
be a useful approximation of R. It is a reasonable quest to study metric spaces 
that contain a countable dense subset (of simpler elements). Such spaces are, 
by definition, separable. You will see that many (but not all) metric spaces are 
separable. The classical example is the space C[0,1]. It is well known that (see 
section 4.8) the set of polynomials with rational coefficients, which is countable, 
is dense in C[0,1]. What can be a nicer approximation of a continuous function 
than a rational polynomial! Separability of a metric space turns out to be equivalent
to the existence of a countable collection of open sets that generate all open sets, 
which is an added benefit and an important characterization of separability. 


Definition. A subset $A$ of a metric space $X$ is dense in $X$ if $\bar A = X$. By theorem
4.2.5, $A$ is dense in $X$ if and only if every point in $X$ is the limit of a sequence in
$A$. Equivalently, $A$ is dense in $X$ if and only if for every $x \in X$ and every $\varepsilon > 0$,
there is an element $a \in A$ such that $d(x,a) < \varepsilon$.


Example 1. Given a function $f \in C[0,1]$ and a number $\varepsilon > 0$, there exists a
continuous, piecewise linear function $g$ such that $\|f - g\|_\infty < \varepsilon$.


We use the uniform continuity of $f$ (see example 8 in section 1.2). Let $\delta > 0$
be such that $|f(x) - f(y)| < \varepsilon$ whenever $|x - y| < \delta$. Choose a natural number $n$
such that $1/n < \delta$, and, for $0 \le j \le n$, let $x_j = j/n$. Define the function $g$ to be
the continuous, piecewise linear function such that $g(x_j) = f(x_j)$ for $0 \le j \le n$.
On each subinterval $[x_j,x_{j+1}]$, the value $g(x)$ lies between $f(x_j)$ and $f(x_{j+1})$, each of
which is within $\varepsilon$ of $f(x)$; hence $\|f - g\|_\infty < \varepsilon$. Observe that this example says that
the space of continuous, piecewise linear functions is dense in $C[0,1]$. ♦
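The construction in example 1 can be tried out numerically. The Python sketch below interpolates a sample continuous function at the nodes $j/n$ and estimates the uniform distance on a fine grid. The sample function $\sin(2\pi t)$, the grid size, and the helper names are assumptions for illustration, and sampling a grid only approximates the supremum.

```python
import math


def piecewise_linear(f, n):
    """Return the interpolant of f at the nodes j/n, 0 <= j <= n, on [0, 1]."""
    nodes = [j / n for j in range(n + 1)]
    values = [f(t) for t in nodes]

    def g(x):
        j = min(int(x * n), n - 1)      # index of a subinterval containing x
        w = (x - nodes[j]) * n          # relative position of x inside it
        return (1.0 - w) * values[j] + w * values[j + 1]

    return g


def sup_distance(f, g, samples=10001):
    """Approximate the uniform distance on [0, 1] by sampling a fine grid."""
    return max(abs(f(k / (samples - 1)) - g(k / (samples - 1)))
               for k in range(samples))


f = lambda t: math.sin(2.0 * math.pi * t)   # sample function, an assumption
errors = [sup_distance(f, piecewise_linear(f, n)) for n in (4, 16, 64)]
print(errors)
```

The estimated distances shrink as the number of nodes grows, in line with the density statement of the example.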


Definition. A metric space is separable if it contains a countable dense subset. 


Example 2. Since $\mathbb{Q}$ is dense in $\mathbb{R}$, $\mathbb{R}$ is separable. More generally, $\mathbb{Q}^n$ is countable
and dense in $\mathbb{R}^n$; hence $\mathbb{R}^n$ is separable.


Example 3. The metric space $\ell^1$ is separable. Let $A = \{(a_1,a_2,\dots,a_n,0,0,\dots) : n \in
\mathbb{N}, a_i \in \mathbb{Q}\}$ be the subset of $\ell^1$ that contains all of the rational sequences with finitely
many nonzero terms. As $A$ is countable, we need to show that it is
dense in $\ell^1$. Let $x = (x_n) \in \ell^1$, and let $\varepsilon > 0$. Since $\sum_{i=1}^\infty |x_i|$ is convergent, there
exists $N \in \mathbb{N}$ such that $\sum_{i=N+1}^\infty |x_i| < \varepsilon/2$. For $1 \le i \le N$, choose $a_i \in \mathbb{Q}$ such that
$|x_i - a_i| < \varepsilon/(2N)$, and let $a = (a_1,\dots,a_N,0,0,\dots)$. Now

\[
\|x - a\|_1 = \sum_{i=1}^N |x_i - a_i| + \sum_{i=N+1}^\infty |x_i| < N \cdot \frac{\varepsilon}{2N} + \frac{\varepsilon}{2} = \varepsilon. \quad ♦
\]
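The two steps of example 3 (cut off the tail, then round the remaining terms to rationals) can be sketched in Python with exact rationals from the standard `fractions` module. The function `approximate`, the caller-supplied tail bound, and the sample sequence $x_i = \sqrt{2}/2^i$ are assumptions for illustration. The sample terms are floating-point stand-ins for the real numbers involved.

```python
import math
from fractions import Fraction


def approximate(x_term, tail_bound, eps):
    """Build a finitely supported rational sequence within eps of x in l^1.

    x_term(i) is the i-th term (i >= 1) and tail_bound(N) is an upper bound
    for the sum of |x_i| over i > N; both are supplied by the caller.
    """
    N = 1
    while tail_bound(N) >= eps / 2:     # make the tail smaller than eps/2
        N += 1
    D = math.ceil(2 * N / eps)          # then 1/(2D) <= eps/(4N) < eps/(2N)
    # Rounding to the nearest multiple of 1/D moves each term by at most 1/(2D).
    return [Fraction(round(x_term(i) * D), D) for i in range(1, N + 1)]


# Hypothetical sample: x_i = sqrt(2) / 2**i, whose tail after N terms
# sums to sqrt(2) / 2**N.
x_term = lambda i: math.sqrt(2.0) / 2 ** i
tail = lambda N: math.sqrt(2.0) / 2 ** N

eps = 1e-3
a = approximate(x_term, tail, eps)
head_error = sum(abs(x_term(i + 1) - float(q)) for i, q in enumerate(a))
print(len(a), head_error + tail(len(a)) < eps)
```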


Example 4. The space $c$ of convergent sequences is separable.
Let $A = \{(a_1,a_2,\dots,a_n,a,a,a,\dots) : n \in \mathbb{N}, a_i \in \mathbb{Q}, a \in \mathbb{Q}\}$ be the set of rational
sequences that are eventually constant. We show that $A$ is dense in $c$. Let
$x = (x_n)$ be a convergent sequence, and let $\xi = \lim_n x_n$. For $\varepsilon > 0$, there is an
integer $N$ such that, for $n > N$, $|x_n - \xi| < \varepsilon/2$. For $1 \le i \le N$, choose $a_i \in \mathbb{Q}$ such
that $|x_i - a_i| < \varepsilon$, and then choose a rational number $a$ such that $|\xi - a| < \varepsilon/2$.
Finally, set $y = (a_1,\dots,a_N,a,a,\dots)$. By construction, $\|x - y\|_\infty < \varepsilon$. ♦


Definition. A collection $\mathcal{B}$ of open subsets of a metric space $X$ is said to be an
open base for $X$ if every open subset of $X$ is the union of members of $\mathcal{B}$.


Example 5. The collection $\mathcal{B} = \{B(x,r) : x \in \mathbb{Q}^n, r \in \mathbb{Q}\}$ of open balls
in $\mathbb{R}^n$ that have rational centers and rational radii is an open base for the
Euclidean metric on $\mathbb{R}^n$. This takes some verification, and we urge the reader to
work out the details. ♦


Definition. A metric space $X$ is second countable if it has a countable open base.


The above example shows that $\mathbb{R}^n$ is second countable because the collection $\mathcal{B}$
is countable.


Definition. A collection of open subsets $\mathcal{U} = \{U_\alpha\}_{\alpha \in I}$ of a metric space $X$ covers
$X$ if $\bigcup_{\alpha \in I} U_\alpha = X$. We also say that $\{U_\alpha\}$ is an open cover of $X$. A subset of $\mathcal{U}$
that also covers $X$ is said to be a subcover of $\mathcal{U}$.


Example 6. The collection $\mathcal{U} = \{(-n,n) : n \in \mathbb{N}\}$ is an open cover of $\mathbb{R}$. The subset
$\{(-2n,2n) : n \in \mathbb{N}\}$ is a subcover of $\mathcal{U}$.


Definition. A metric space $X$ is said to be a Lindelöf space if every open cover of
$X$ contains a countable subcover of $X$.


Theorem 4.5.1. The following are equivalent for a metric space $X$:
(a) $X$ is separable.
(b) $X$ is second countable.
(c) $X$ is Lindelöf.


Proof. (a) implies (b). Let $A = \{a_1,a_2,\dots\}$ be a countable dense subset of $X$. We claim
that the countable collection $\mathcal{B} = \{B(a_n,r) : n \in \mathbb{N}, r \in \mathbb{Q}\}$ is an open base for $X$.
To prove that every open subset of $X$ is the union of members of $\mathcal{B}$, it is sufficient to
show that if $x \in X$ and $\delta > 0$, there exist an element $a_n \in A$ and a rational number
$r$ such that $x \in B(a_n,r) \subseteq B(x,\delta)$. Pick an element $a_n \in A$ such that $d(x,a_n) <
\delta/4$, and choose a rational number $r$ such that $\delta/4 < r < \delta/2$. Then $x \in B(a_n,r)$,
and if $y \in B(a_n,r)$, then $d(x,y) \le d(x,a_n) + d(a_n,y) < \delta/4 + r < \delta$, so $B(a_n,r) \subseteq
B(x,\delta)$.


(b) implies (c). Let $\mathcal{B} = \{B_n : n \in \mathbb{N}\}$ be a countable open base for $X$. Suppose,
for some collection $\mathcal{U} = \{U_\alpha : \alpha \in I\}$ of open subsets of $X$, $X = \bigcup_{\alpha \in I} U_\alpha$. For each
natural number $n$, pick an element $V_n$ in $\mathcal{U}$ that contains $B_n$. If no element of
$\mathcal{U}$ contains $B_n$, define $V_n = \emptyset$. We claim that $\{V_n\}_{n \in \mathbb{N}}$ covers $X$. If $x \in X$, then
$x \in U_\alpha$ for some $\alpha \in I$. There exists $B_n$ such that $x \in B_n \subseteq U_\alpha$; thus, $V_n \ne \emptyset$ and
$x \in V_n$. The nonempty sets $V_n$ therefore form a countable subcover of $\mathcal{U}$.


(c) implies (a). For a fixed $n \in \mathbb{N}$, $X = \bigcup_{x \in X} B(x,1/n)$. By assumption, there exists a
set $\{x_{n,1},x_{n,2},\dots\}$ such that $X = \bigcup_{j=1}^\infty B(x_{n,j},1/n)$. We claim that $\{x_{n,j} : n,j \in \mathbb{N}\}$ is
dense in $X$. Let $x \in X$ and let $\delta > 0$. Choose $n \in \mathbb{N}$ such that $1/n < \delta$. Because
$x \in \bigcup_{j=1}^\infty B(x_{n,j},1/n)$, $x \in B(x_{n,j},1/n)$ for some $j \in \mathbb{N}$. Now $d(x_{n,j},x) < 1/n < \delta$,
and the proof is complete.


The following example shows that a separable metric space is, in a way, not too 
large. 


Example 7. The cardinality of a separable metric space is, at most, $\mathfrak{c}$.

Let $A = \{a_n : n \in \mathbb{N}\}$ be a countable dense subset of a separable metric space
$X$. For each $x \in X$, define a real sequence $S_x = (d(x,a_1), d(x,a_2), d(x,a_3), \dots)$. We
prove that the function $x \mapsto S_x$ is an injection from $X$ to the space $\mathbb{R}^{\mathbb{N}}$ of real
sequences. Let $x$ and $y$ be distinct elements of $X$, and let $r = d(x,y)$. Since $A$ is
dense in $X$, there exists an element $a_n$ of $A$ such that $d(x,a_n) < r/3$. Since
$d(y,a_n) \ge d(x,y) - d(x,a_n)$, it is easy to see that $d(y,a_n) > 2r/3$. In particular,
$d(x,a_n) \ne d(y,a_n)$; thus $S$ is an injection.
It follows that $\mathrm{Card}(X) \le \mathrm{Card}(\mathbb{R}^{\mathbb{N}}) = \mathfrak{c}^{\aleph_0} = \mathfrak{c}$. ♦


Exercises


1. Show that $\ell^p$ is separable for all $1 \le p < \infty$.
2. Show that the space $c_0$ of null sequences is separable.
3. Prove that a subset $A$ of a metric space $X$ is dense if and only if it intersects
every nonempty open subset of $X$.

4. Prove that if $X$ is separable, then any collection of pairwise disjoint open
subsets of $X$ is countable. Conclude that $\ell^\infty$ is not separable. See problem 10
in section 4.1.

5. Show that a normed linear space X is separable if and only if the unit ball 
{x € X : ||x|| < 1} is separable. 

6. Show that a subspace of a second countable space is second countable. 

7. Show that a subspace of a separable space is separable and that a subspace
of a Lindelöf space is Lindelöf.

8. Show that the product of finitely many separable metric spaces is separable. 

9. Let $f$ be a function from a metric space $X$ to a metric space $Y$, and let $\mathcal{B}$ be
an open base for $Y$. Prove that $f$ is continuous if and only if $f^{-1}(V)$ is open
for every $V \in \mathcal{B}$.

10. Prove that every open subset $V$ of $\mathbb{R}$ is the countable union of disjoint open
intervals. Hint: For $x \in V$, let $a_x = \inf\{t \in V : (t,x] \subseteq V\}$, and $b_x = \sup\{t \in
V : [x,t) \subseteq V\}$. Show that $I_x = (a_x,b_x) \subseteq V$, then prove that, for $x,y \in V$,
either $I_x = I_y$ or $I_x \cap I_y = \emptyset$.


4.6 Completeness 


The mathematicians of antiquity had a clear understanding of the existence 
of irrational numbers, and mathematicians through the ages understood that 
irrational numbers are gaps inside the rational number field. Thus it was quite 
well understood that the rational field is not complete. It took some twenty- 
four centuries for a rigorous definition of the real number field as a complete 
ordered field to materialize. The definitions and some of the results in this section 
parallel those in section 1.2. For example, the proof of the Bolzano-Weierstrass
property of bounded sets (theorem 1.2.10) includes a proof of the nested interval
theorem, which is a very special case of the Cantor intersection theorem. Another
highlight of this section is Baire’s theorem, which is one of the cornerstones upon
which functional analysis is built. We will establish the completeness of the $\ell^p$
spaces as well as the function space $C[a,b]$, which will pave the way for a number
of interesting applications begun in the section and continued in the section 
exercises. 


Definition. A sequence $(x_n)$ in a metric space $X$ is said to be a Cauchy sequence
if, for every $\varepsilon > 0$, there is a natural number $N$ such that,

for all $m,n > N$, $d(x_n,x_m) < \varepsilon$.


Theorem 4.6.1. A convergent sequence is a Cauchy sequence.


Proof. Let $\lim x_n = x$, and let $\varepsilon > 0$. There exists a natural number $N$ such that, for
$n > N$, $d(x_n,x) < \varepsilon/2$. Now, for $m,n > N$, $d(x_n,x_m) \le d(x_n,x) + d(x,x_m) < \varepsilon$.


Theorem 4.6.2. A Cauchy sequence is bounded.


Proof. Let $\varepsilon = 1$. There exists a positive integer $N$ such that, for $m,n \ge N$,
$d(x_n,x_m) < 1$. In particular, $d(x_n,x_N) < 1$ for all $n \ge N$. Therefore, for all $n \in \mathbb{N}$,
$d(x_n,x_N) \le \max\{1, d(x_1,x_N), \dots, d(x_{N-1},x_N)\}$. ■


Theorem 4.6.3. If a Cauchy sequence $(x_n)$ contains a subsequence $(x_{n_k})$ that converges
to $x$, then $(x_n)$ converges to $x$.


Proof. Let $\varepsilon > 0$. There exists a natural number $N$ such that, for $m,n > N$,
$d(x_n,x_m) < \varepsilon/2$. Since $\lim_k x_{n_k} = x$, there exists an integer $K$ such that, for $k \ge K$,
$d(x_{n_k},x) < \varepsilon/2$. Without loss of generality, we may assume that $K > N$ and
thus $n_K \ge K > N$. Taking $m = n_K$ and using the triangle inequality, for $n > N$,
$d(x_n,x) \le d(x_n,x_{n_K}) + d(x_{n_K},x) < \varepsilon$.

Definition. A metric space X is complete if every Cauchy sequence in X converges 
to a point in X. 


Before we look at major examples of complete spaces, we look at an example of 
an incomplete one. 


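Example 1. Consider the chordal metric $\chi$ on $\mathbb{R}$. We will show that although the
sequence $x_n = n$ is a Cauchy sequence in $(\mathbb{R},\chi)$, it does not converge.

\[
\chi(n,m) = \frac{|n-m|}{\sqrt{1+n^2}\,\sqrt{1+m^2}} \le \left|\frac{1}{n} - \frac{1}{m}\right| \to 0 \quad \text{as } m,n \to \infty.
\]

To prove that the sequence does not converge to any $x \in \mathbb{R}$, we observe that

\[
\lim_n \chi(n,x) = \lim_n \frac{|n-x|}{\sqrt{1+n^2}\,\sqrt{1+x^2}} = \frac{1}{\sqrt{1+x^2}} \ne 0. \quad ♦
\]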
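The two limits in example 1 can be observed numerically. In the Python sketch below, the gaps $\chi(n,2n)$ between far-out terms shrink toward 0, while the distance from $x_n = n$ to a fixed point stays near $1/\sqrt{1+x^2}$. The choice of the pairs $(n,2n)$ and of the fixed point $x = 3$ is an assumption made for illustration.

```python
import math


def chordal(x, y):
    """The chordal metric on the real line."""
    return abs(x - y) / (math.sqrt(1.0 + x * x) * math.sqrt(1.0 + y * y))


# Distances between far-out terms of x_n = n shrink toward 0 ...
tail_gaps = [chordal(n, 2 * n) for n in (10, 100, 1000, 10000)]

# ... yet the distance from x_n to a fixed x approaches 1 / sqrt(1 + x^2) > 0.
x = 3.0
to_x = [chordal(n, x) for n in (10, 100, 1000, 10000)]
limit = 1.0 / math.sqrt(1.0 + x * x)

print(tail_gaps)
print(to_x, limit)
```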


Theorem 4.6.4.
(a) A closed subspace $A$ of a complete metric space $X$ is complete.
(b) A complete subspace $A$ of a metric space $X$ is closed.


Proof. (a) Let $(x_n)$ be a Cauchy sequence in $A$. Since $X$ is complete, there exists $x \in X$
such that $\lim_n x_n = x$. Since $A$ is closed, theorem 4.2.5 guarantees that $x \in A$.


(b) Let $x \in \bar A$. By theorem 4.2.5, there exists a sequence $(x_n)$ in $A$ such that
$\lim_n x_n = x$. Now $(x_n)$ is Cauchy (theorem 4.6.1), and $A$ is complete, so $(x_n)$
converges to a point $y$ in $A$. By the uniqueness of limits, $x = y$; hence $x \in A$, and
$A$ is closed. ■


Example 2. The space $c$ of convergent sequences is complete. We show that $c$
is complete by showing that it is closed in $\ell^\infty$. See problem 2 in the section
exercises. Let $x = (x_n) \in \ell^\infty$ be a closure point of $c$. We show that $x$ is convergent
by showing that it is Cauchy. Let $\varepsilon > 0$, and choose a convergent sequence
$y = (y_n)$ such that $\|x - y\|_\infty < \varepsilon$. Because $(y_n)$ is Cauchy, there exists an integer
$N$ such that, for $n,m > N$, $|y_n - y_m| < \varepsilon$. Now if $n,m > N$, then

\[
|x_n - x_m| \le |x_n - y_n| + |y_n - y_m| + |y_m - x_m| < 3\varepsilon. \quad ♦
\]
Theorem 4.6.5. The sequence spaces $\ell^p$ are complete for $1 \le p \le \infty$.


Proof. We leave the proof of the case $p = \infty$ to the reader (also see theorem
4.8.1). Fix $1 \le p < \infty$, let $(x_n)$ be a Cauchy sequence in $\ell^p$, and write $x_n =
(x_{n,1},x_{n,2},\dots,x_{n,k},\dots)$. Given $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that, for $n,m >
N$, $\|x_n - x_m\|_p^p = \sum_{k=1}^\infty |x_{n,k} - x_{m,k}|^p < \varepsilon^p$. In particular, if $k$ is a fixed positive
integer, then, for every $n,m > N$, $|x_{n,k} - x_{m,k}| < \varepsilon$. Thus $(x_{n,k})_{n=1}^\infty$ is a Cauchy
sequence in $\mathbb{K}$. By the completeness of $\mathbb{K}$, $x_k = \lim_n x_{n,k}$ exists for every $k \in \mathbb{N}$.
Set $x = (x_k)_{k \ge 1}$. We will show that $x \in \ell^p$ and that $\lim_n \|x_n - x\|_p = 0$. For an
arbitrary positive integer $K$,

\[
\sum_{k=1}^K |x_k|^p = \lim_n \sum_{k=1}^K |x_{n,k}|^p \le \limsup_n \sum_{k=1}^\infty |x_{n,k}|^p = \limsup_n \|x_n\|_p^p.
\]

Because $(x_n)$ is Cauchy, $\|x_n\|_p$ is bounded by theorem 4.6.2; hence $\limsup_n \|x_n\|_p <
\infty$. This shows $\sum_{k=1}^\infty |x_k|^p < \infty$, and hence $x \in \ell^p$.


Finally we show that $\|x_n - x\|_p \to 0$, as $n \to \infty$. For arbitrary positive integers $n$
and $K$,

\[
\sum_{k=1}^K |x_{n,k} - x_k|^p = \lim_{m \to \infty} \sum_{k=1}^K |x_{n,k} - x_{m,k}|^p \le \limsup_{m \to \infty} \|x_n - x_m\|_p^p.
\]

Taking the limit as $K \to \infty$ of the extreme left side of the above string of
inequalities, we have

\[
\|x_n - x\|_p^p = \sum_{k=1}^\infty |x_{n,k} - x_k|^p \le \limsup_{m \to \infty} \|x_n - x_m\|_p^p.
\]

Observe that the above inequalities hold for an arbitrary positive integer $n$. Now
let $\varepsilon > 0$. There exists $N \in \mathbb{N}$ such that, for $m,n > N$, $\|x_n - x_m\|_p < \varepsilon$. Thus, for
$n > N$, $\limsup_{m \to \infty} \|x_n - x_m\|_p \le \varepsilon$. We have shown that, for $n > N$, $\|x_n - x\|_p \le
\varepsilon$. This completes the proof.


Theorem 4.6.6 (the Cantor intersection theorem). Suppose $X$ is a complete metric
space. If $\{F_n\}_{n=1}^\infty$ is a descending sequence of closed nonempty subsets of $X$ such that
$\lim_n \mathrm{diam}(F_n) = 0$, then $\bigcap_{n=1}^\infty F_n$ is a one-point set.


Proof. For every $n \in \mathbb{N}$, choose a point $x_n \in F_n$. Let $\varepsilon > 0$. There exists a natural
number $N$ such that, for $n > N$, $\mathrm{diam}(F_n) < \varepsilon$. Now if $m > n > N$, then $F_n \supseteq F_m$,
and $x_n,x_m \in F_n$, and hence $d(x_n,x_m) < \varepsilon$. This makes $(x_n)$ a Cauchy sequence and
hence convergent to, say, $x$. Each of the sets $F_n$ contains all but a finite number
of terms of $(x_k)$. Since each $F_n$ is closed, $x \in F_n$ for all $n$, and $x \in \bigcap_{n=1}^\infty F_n$. Now
$\mathrm{diam}(\bigcap_{n=1}^\infty F_n) \le \mathrm{diam}(F_n) \to 0$. Hence $\bigcap_{n=1}^\infty F_n = \{x\}$.
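The nested interval theorem mentioned at the start of this section is the most familiar instance of the Cantor intersection theorem, and bisection makes it concrete. The Python sketch below is an illustration under stated assumptions: the function `nested_intervals`, the sample polynomial $t^2 - 2$, and the number of steps are all hypothetical choices. It produces closed intervals $F_1 \supseteq F_2 \supseteq \dots$ in the complete space $\mathbb{R}$ whose diameters halve at each step.

```python
def nested_intervals(f, a, b, steps):
    """Bisect [a, b], keeping the half on which f changes sign.

    Assumes f is continuous with f(a) <= 0 <= f(b); the returned list holds
    the closed intervals F_1, F_2, ..., each contained in the previous one.
    """
    intervals = []
    for _ in range(steps):
        mid = (a + b) / 2.0
        if f(mid) <= 0.0:
            a = mid
        else:
            b = mid
        intervals.append((a, b))
    return intervals


# Hypothetical sample: the intervals close in on the positive root of t^2 - 2.
F = nested_intervals(lambda t: t * t - 2.0, 1.0, 2.0, 40)
diameters = [b - a for a, b in F]
print(F[-1], diameters[-1])
```

The left endpoints form a Cauchy sequence, exactly as the points $x_n$ do in the proof, and the single point in the intersection is the root.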


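Definition. A subset $A$ of a metric space $X$ is nowhere dense if $\mathrm{int}(\bar A) = \emptyset$.


Example 3. $\mathbb{N}$ is nowhere dense in $\mathbb{R}$. The reader is cautioned that $\mathrm{int}(\bar A) \ne \overline{\mathrm{int}(A)}$;
for example, $\overline{\mathrm{int}(\mathbb{Q})} = \emptyset$, while $\mathrm{int}(\bar{\mathbb{Q}}) = \mathbb{R}$.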


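Theorem 4.6.7 (Baire’s theorem). A complete metric space cannot be expressed as
a countable union of nowhere dense subsets.


Proof. Let $\{A_n\}$ be a countable family of nowhere dense subsets of $X$. Without loss
of generality, assume that each $A_n$ is closed (otherwise replace $A_n$ with its closure,
which is also nowhere dense). Since $X - A_1$ is open and nonempty,
there exists a ball $B_1 = B(x_1,\delta_1)$ such that $B_1 \cap A_1 = \emptyset$. By reducing the radius
$\delta_1$, if necessary, we may assume that $\delta_1 < 1$ and that $\bar B_1 \cap A_1 = \emptyset$. Since $B_1 - A_2$
is open and nonempty, we can find a ball $B_2 = B(x_2,\delta_2) \subseteq B_1$ such that $B_2 \cap A_2 = \emptyset$.
As before, we may assume that $\delta_2 < 1/2$, $\bar B_2 \subseteq B_1$, and $\bar B_2 \cap A_2 = \emptyset$. We can continue this
process and construct a sequence of balls $\{B_n\}$ such that $\bar B_n \cap A_n = \emptyset$, $\bar B_1 \supseteq \bar B_2 \supseteq
\dots$, and $\mathrm{diam}(\bar B_n) \le 2/n$. By the Cantor intersection theorem, $\bigcap_{n=1}^\infty \bar B_n = \{x\}$. Since
$\bar B_n \cap A_n = \emptyset$, $x \notin A_n$ for all $n \in \mathbb{N}$, and $\bigcup_{n=1}^\infty A_n \ne X$.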


The following two results are powerful consequences of Baire’s theorem. 
Theorem 4.6.8. Let $\{A_n\}$ be a countable family of closed nowhere dense subsets of
a complete metric space $X$, and let $U_0$ be a nonempty open subset of $X$. Then

$$U_0 - \bigcup_{n=1}^\infty A_n \ne \emptyset.$$


Proof. Modify the proof of Baire’s theorem by requiring that $B_1 \subseteq U_0 - A_1$.

This result generalizes Baire’s theorem: not only is the set $A = \bigcup_{n=1}^\infty A_n$ not equal
to $X$, but $A$, in fact, has an empty interior.


Theorem 4.6.9. If $\{U_n\}$ is a countable collection of open dense subsets of a complete
metric space, then $\bigcap_{n=1}^\infty U_n$ is dense.


Proof. If there is a nonempty open subset $U_0$ such that $U_0 \cap \bigcap_{n=1}^\infty U_n = \emptyset$, then $U_0 \subseteq
X - \bigcap_{n=1}^\infty U_n = \bigcup_{n=1}^\infty (X - U_n)$. This contradicts the previous theorem because each
$X - U_n$ is closed and nowhere dense. See problem 10 at the end of the section.


Theorem 4.6.10. The product of a finite number $\{(X_i, d_i)\}_{i=1}^n$ of complete metric
spaces is complete.


Proof. Since $X_1 \times \dots \times X_n$ is isometric to $X_1 \times (X_2 \times \dots \times X_n)$, it is enough to show
that the product of two complete metric spaces, $(X_1, d_1)$ and $(X_2, d_2)$, is complete.
Let $(x^{(k)})$ be a Cauchy sequence in $X_1 \times X_2$, and write $x^{(k)} = (x_1^{(k)}, x_2^{(k)})$. Since
$d_i(x_i^{(k)}, x_i^{(l)}) \le D(x^{(k)}, x^{(l)})$, each of the sequences $(x_i^{(k)})$ is Cauchy. Therefore, for
$i = 1, 2$, $\lim_k x_i^{(k)} = x_i$ exists. Clearly, $\lim_k x^{(k)} = (x_1, x_2)$.


Before we embark on the application subsection, we prove the following result. 
Theorem 4.6.11. The space $(C[a,b], \|\cdot\|_\infty)$ is complete.


Proof. Let $(f_n)$ be a Cauchy sequence in $C[a,b]$. For $\varepsilon > 0$, there is a positive integer
$N$ such that $\|f_n - f_m\|_\infty < \varepsilon$ for every $m, n > N$. Thus, for every $x \in [a,b]$ and
every $m, n > N$, $|f_n(x) - f_m(x)| < \varepsilon$; hence $(f_n(x))$ is a Cauchy sequence for every
$x \in [a,b]$. By the completeness of $\mathbb{K}$, $f(x) = \lim_n f_n(x)$ exists for every $x$.


We claim that $\lim_n \|f_n - f\|_\infty = 0$. Let $\varepsilon$ and $N$ be as in the previous paragraph.
Then $|f_n(x) - f_m(x)| < \varepsilon$ for every $x \in [a,b]$ and every $n, m > N$. Taking the limit
as $m \to \infty$, we obtain $|f_n(x) - f(x)| \le \varepsilon$ for every $x \in [a,b]$ and every $n > N$. This
means that $\|f_n - f\|_\infty \le \varepsilon$, as claimed.


Finally, we need to show that $f$ is continuous. Suppose that $x_k \in [a,b]$ and that
$\lim_k x_k = x$. Let $\varepsilon > 0$. By the previous paragraph, there is an integer $N$ such that
$\|f_N - f\|_\infty < \varepsilon$. By the continuity of $f_N$ at $x$, there exists an integer $K$ such that, for
$k > K$, $|f_N(x_k) - f_N(x)| < \varepsilon$. Now, for $k > K$,

$$|f(x) - f(x_k)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(x_k)| + |f_N(x_k) - f(x_k)| < 3\varepsilon.$$

Example 4 (the Weierstrass M-test). Let $(f_n)$ be a sequence in $C[a,b]$, and suppose
that there exists a real sequence $(M_n)$ such that, for every $n \in \mathbb{N}$, $\|f_n\|_\infty \le M_n$
and $\sum_{n=1}^\infty M_n < \infty$. Then the series of functions $\sum_{n=1}^\infty f_n(x)$ converges in $C[a,b]$.


We prove that the sequence of partial sums $S_n(x) = \sum_{i=1}^n f_i(x)$ is a Cauchy
sequence in $C[a,b]$. Let $\varepsilon > 0$. By the convergence of the positive series
$\sum_{n=1}^\infty M_n$, there is an integer $N$ such that, for $m > n > N$, $\sum_{i=n+1}^m M_i < \varepsilon$.⁴
Thus, for $m > n > N$, and for every $x \in [a,b]$, $|S_m(x) - S_n(x)| \le \sum_{i=n+1}^m |f_i(x)| \le
\sum_{i=n+1}^m M_i < \varepsilon$, or $\|S_m - S_n\|_\infty < \varepsilon$. This shows that $(S_n)$ is a Cauchy sequence and
hence, by the completeness of $C[a,b]$, is convergent to a function $f \in C[a,b]$.


In fact, the series $\sum_{n=1}^\infty f_n(x)$ converges to $f$ absolutely as well as uniformly on
$[a,b]$.
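
The estimate in the M-test can be checked numerically. The following is a minimal sketch, assuming the illustrative series $f_n(x) = \sin(nx)/n^2$ on $[0, 2\pi]$ with $M_n = 1/n^2$ (this series, the helper names, and the sampling grid are choices made here, not taken from the text). The sampled sup-norm of $S_m - S_n$ should never exceed the tail $\sum_{i=n+1}^m M_i$.

```python
import math

def partial_sum(x, n):
    """S_n(x) = sum_{i=1}^n sin(i x) / i^2 (illustrative choice of f_i)."""
    return sum(math.sin(i * x) / i**2 for i in range(1, n + 1))

def tail_bound(n, m):
    """sum_{i=n+1}^m M_i with M_i = 1/i^2; this dominates |S_m(x) - S_n(x)|."""
    return sum(1.0 / i**2 for i in range(n + 1, m + 1))

def sup_diff(n, m, a=0.0, b=2 * math.pi, samples=400):
    """Approximate ||S_m - S_n||_inf by sampling [a, b] on a uniform grid."""
    xs = [a + (b - a) * j / samples for j in range(samples + 1)]
    return max(abs(partial_sum(x, m) - partial_sum(x, n)) for x in xs)
```

Sampling only gives a lower estimate of the true supremum, so the check is a sanity test of the inequality rather than a proof of it.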


Example 5. If a sequence $(f_n)$ converges to $f$ in $C[a,b]$, then

$$\lim_n \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.$$

In particular, if the series $\sum_{n=1}^\infty g_n(x)$ converges in $C[a,b]$, then

$$\int_a^b \sum_{n=1}^\infty g_n(x)\,dx = \sum_{n=1}^\infty \int_a^b g_n(x)\,dx.$$

Let $\varepsilon > 0$. There exists an integer $N$ such that for $n > N$ and all $x \in [a,b]$, $|f_n(x) -
f(x)| < \varepsilon$. Now if $n > N$, then

$$\left| \int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx \right| \le \int_a^b |f_n(x) - f(x)|\,dx \le \varepsilon(b - a).$$

Applications of Completeness, Part 1: Contraction Mappings and Applications

In this application we prove the contraction mapping theorem,⁵ which is one of
the simplest fixed point theorems. Then we apply it to derive the existence and
uniqueness of solutions of certain types of differential and integral equations.


⁴ The sequence of partial sums $\sum_{i=1}^n M_i$ is Cauchy because the series $\sum_{n=1}^\infty M_n$ is convergent.
⁵ Also called Banach’s fixed point theorem.

Definition. Let $X$ be a metric space. A function $T : X \to X$ is called a contraction
if there exists a constant $0 < k < 1$ such that,

for all $x, y \in X$, $d(T(x), T(y)) \le k\,d(x, y)$.


Theorem 4.6.12 (contraction mapping theorem). Let $T : X \to X$ be a contraction
on a complete metric space $X$. Then $T$ has a unique fixed point. Thus there is a
unique point $z$ in $X$ such that $T(z) = z$.


Proof. Let $x_0$ be an arbitrary point in $X$, and define a sequence $(x_n)$ in $X$ by
$x_{n+1} = T(x_n)$. First we show that $(x_n)$ is a Cauchy sequence.


For $n \ge 1$, $d(x_{n+1}, x_n) = d(T(x_n), T(x_{n-1})) \le k\,d(x_n, x_{n-1})$, and, by induction,
$d(x_{n+1}, x_n) \le k^n d(x_1, x_0)$. Now, for $m, n \in \mathbb{N}$ with $m < n$,

$$\begin{aligned}
d(x_n, x_m) &\le d(x_n, x_{n-1}) + d(x_{n-1}, x_{n-2}) + \dots + d(x_{m+1}, x_m) \\
&\le (k^{n-1} + k^{n-2} + \dots + k^m)\,d(x_1, x_0) = k^m (1 + k + \dots + k^{n-m-1})\,d(x_1, x_0) \\
&\le k^m \sum_{j=0}^\infty k^j\,d(x_1, x_0) = \frac{k^m}{1-k}\,d(x_1, x_0).
\end{aligned}$$

Since $\lim_m k^m = 0$, $(x_n)$ is a Cauchy sequence. By the completeness of $X$, $z =
\lim_n x_n$ exists. We claim that $z$ is the unique fixed point of $T$.


Now $z = \lim_n x_n = \lim_n T(x_{n-1}) = T(\lim_n x_{n-1}) = T(z)$. To show that $z$ is unique,
suppose $w$ is a fixed point of $T$. Then $d(z, w) = d(T(z), T(w)) \le k\,d(z, w)$. This
would be a contradiction unless $d(z, w) = 0$, that is, $z = w$.
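
The proof is constructive: iterating $T$ from any starting point produces the fixed point. Below is a minimal sketch of that iteration, assuming the illustrative map $T = \cos$ on the complete metric space $[0, 1]$ (the function name `fixed_point`, the tolerance, and the choice of map are assumptions made here for illustration).

```python
import math

def fixed_point(T, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = T(x_n) until successive iterates differ by less than tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = T(x)
        if abs(x_next - x) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# cos maps [0, 1] into itself and |cos'(x)| = |sin x| <= sin(1) < 1 there,
# so cos is a contraction on [0, 1] with constant k = sin(1).
z, steps = fixed_point(math.cos, 1.0)
```

The stopping rule uses the distance between successive iterates; by the estimate in the proof, the distance to the true fixed point is at most $k/(1-k)$ times that quantity.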


Definition. A function $f : [a,b] \times \mathbb{R} \to \mathbb{R}$ is said to be a Lipschitz function in its
second variable if there is a constant $L > 0$ such that

$$|f(x, y) - f(x, z)| \le L|y - z|,$$

for all $x \in [a,b]$ and all $y, z \in \mathbb{R}$.


Theorem 4.6.13. Consider the initial value problem

$$\frac{dy}{dx} = f(x, y), \quad y(a) = y_0.$$

Suppose that $f : [a,b] \times \mathbb{R} \to \mathbb{R}$ is continuous and that it satisfies the Lipschitz
condition in its second argument: $|f(x,y) - f(x,z)| \le L|y - z|$. Then the initial
value problem has a unique solution $y(x)$ on the interval $[a,b]$.

Proof. Choose a constant $K > L$, and define a metric on $X = C[a,b]$ by

$$d(y, z) = \sup_{x \in [a,b]} \exp\{-K(x - a)\}\,|y(x) - z(x)|.$$

It is relatively easy to check that $d$ is a complete metric on $X$. Note that, for all
$x \in [a,b]$, and all $y, z \in X$, $|y(x) - z(x)| \le e^{K(x-a)} d(y, z)$. The initial value problem
in question is equivalent to the integral equation

$$y(x) = y_0 + \int_a^x f(s, y(s))\,ds.$$

Define a function $T : X \to X$ by $y \mapsto T_y$, where $T_y(x) = y_0 + \int_a^x f(s, y(s))\,ds$. The
proof will be complete if we show that $T$ has a unique fixed point $y \in X$. For two
functions $y, z \in X$,

$$\begin{aligned}
e^{-K(x-a)} |T_y(x) - T_z(x)| &= e^{-K(x-a)} \left| \int_a^x [f(s, y(s)) - f(s, z(s))]\,ds \right| \\
&\le e^{-K(x-a)} \int_a^x |f(s, y(s)) - f(s, z(s))|\,ds \le L e^{-K(x-a)} \int_a^x |y(s) - z(s)|\,ds \\
&\le L e^{-K(x-a)} d(y, z) \int_a^x e^{K(s-a)}\,ds = \frac{L}{K} e^{-K(x-a)} d(y, z) [e^{K(x-a)} - 1] \\
&\le \frac{L}{K} e^{-K(x-a)} d(y, z) e^{K(x-a)} = \frac{L}{K} d(y, z) = k\,d(y, z), \quad \text{where } k = \frac{L}{K} < 1.
\end{aligned}$$

The above inequalities show that $d(T_y, T_z) \le k\,d(y, z)$; hence $T$ is a contraction. We
now invoke the contraction mapping theorem to conclude that the initial value
problem has a unique solution in $C[a,b]$.
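
The operator $T$ in this proof can be iterated numerically (Picard iteration). The sketch below is an illustration under stated assumptions: the integral is replaced by the composite trapezoid rule on a uniform grid, and the test problem $y' = y$, $y(0) = 1$ on $[0, 1]$ is chosen here only because its exact solution $e^x$ is known. The function name `picard` and its parameters are hypothetical.

```python
import math

def picard(f, a, b, y0, iterations=30, grid=1000):
    """Apply T_y(x) = y0 + int_a^x f(s, y(s)) ds repeatedly on a uniform grid.

    The integral is approximated by the composite trapezoid rule, so the
    result is a discretized fixed point, not the exact solution.
    """
    h = (b - a) / grid
    xs = [a + j * h for j in range(grid + 1)]
    y = [y0] * (grid + 1)            # start from the constant function y0
    for _ in range(iterations):
        g = [f(x, yx) for x, yx in zip(xs, y)]
        new_y = [y0]
        for j in range(1, grid + 1):
            # running trapezoid sum approximates the integral from a to x_j
            new_y.append(new_y[-1] + 0.5 * h * (g[j - 1] + g[j]))
        y = new_y
    return xs, y

# Illustrative problem: y' = y, y(0) = 1 on [0, 1]; exact solution e^x, L = 1.
xs, ys = picard(lambda x, y: y, 0.0, 1.0, 1.0)
```

The remaining error at $x = 1$ comes almost entirely from the quadrature, not from the fixed point iteration, which converges very quickly for this problem.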

Theorem 4.6.14. If $X$ is a complete metric space and $T : X \to X$ is such that $T^n$ is
a contraction for some positive integer $n$, then the unique fixed point of $T^n$ is the
unique fixed point of $T$.

Proof. Let $x$ be the unique fixed point of $T^n$. Thus $T^n(x) = x$. Now $T(x) = T^{n+1}(x) =
T^n(T(x))$. Thus $T(x)$ is a fixed point of $T^n$. But the fixed point of $T^n$ is unique, so
$T(x) = x$. We leave it to the reader to show that $x$ is the only fixed point of $T$.


Theorem 4.6.15. Consider the nonlinear Volterra equation

$$u(x) = \int_a^x K(x, y, u(y))\,dy + f(x).$$

Suppose $f \in C[a,b]$ and that $K$ is continuous on $[a,b] \times [a,b] \times (-\infty, \infty)$ and
satisfies the Lipschitz condition $|K(x,y,z_1) - K(x,y,z_2)| \le L|z_1 - z_2|$ for all
$x, y \in [a,b]$ and all $z_1, z_2 \in \mathbb{R}$. Then the above integral equation has a unique
solution $u \in C[a,b]$.


Let $X = C[a,b]$, equipped with the uniform metric. Define a function $T : X \to X$
by $u \mapsto T_u$, where

$$T_u(x) = f(x) + \int_a^x K(x, y, u(y))\,dy.$$

We leave it to the reader to verify that $T_u \in C[a,b]$. For $u, v \in X$,

$$|T_u(x) - T_v(x)| \le L \|u - v\|_\infty (x - a).$$

One more application of $T$ yields

$$|T^2_u(x) - T^2_v(x)| \le L^2 \|u - v\|_\infty (x - a)^2 / 2,$$

and, by induction,

$$|T^n_u(x) - T^n_v(x)| \le \frac{L^n}{n!} \|u - v\|_\infty (x - a)^n \le \frac{L^n (b - a)^n}{n!} \|u - v\|_\infty.$$

For sufficiently large $n$, $k = \frac{L^n (b-a)^n}{n!} < 1$, and, for such an $n$, $T^n$ is a contraction.
By theorem 4.6.14, $T$ has a unique fixed point, and the Volterra equation has a
unique solution.


Applications of Completeness Part 2: Continuous, Nowhere Differentiable 
Functions 


The first example of a continuous, nowhere differentiable function was produced
by Weierstrass in 1872. Until that time, it was generally believed that
continuous functions could fail to be differentiable only at an isolated set of points. The
main result in this application establishes an extreme contrast to the Weierstrass
polynomial approximation theorem. Like polynomials, which are simple, well-behaved
functions, the very erratic continuous, nowhere differentiable functions
are also dense in $C[0,1]$.


Definition. For a fixed integer $n \ge 1$, let $\mathcal{F}_n$ be the set of functions $f \in C[0,1]$
for which there is a point $x_0 \in [0,1]$ such that, for all $x \in [0,1]$, $|f(x) - f(x_0)| \le
n|x - x_0|$.

The geometric meaning of the condition in the above definition is that the slope
of the line joining the point $(x_0, f(x_0))$ and an arbitrary point $(x, f(x))$ on the graph
of $f$ cannot exceed $n$ in absolute value. There is no shortage of functions in $\mathcal{F}_n$. For
example, the functions $f(x) = ax$ are in $\mathcal{F}_n$ if $|a| \le n$.


Example 6. If $f \in C[0,1]$ has a continuous derivative and $\|f'\|_\infty \le n$, then $f \in \mathcal{F}_n$.
Fix a point $x_0 \in [0,1]$. For any $x \in [0,1]$, $|f(x) - f(x_0)| = |f'(\xi)(x - x_0)| \le n|x -
x_0|$. Here $\xi$ is a point between $x$ and $x_0$.


The assumptions of the previous example are much stronger than they need to
be. The differentiability of $f$ at a single point in $(0,1)$ is enough to guarantee that
$f \in \mathcal{F}_n$ for some $n$, as the following example illustrates.


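Example 7. If $f \in C[0,1]$ is differentiable at $x_0 \in (0,1)$, then $f \in \mathcal{F}_n$ for some
$n \ge 1$.
By assumption, there exists a number $\delta > 0$ such that, for $0 < |x - x_0| < \delta$,
$\left| \frac{f(x) - f(x_0)}{x - x_0} - f'(x_0) \right| < 1$. Thus, for $|x - x_0| < \delta$, $|f(x) - f(x_0)| \le (|f'(x_0)| + 1)|x - x_0|$.
If $|x - x_0| \ge \delta$, then $|f(x) - f(x_0)| \le 2\|f\|_\infty = \frac{2\|f\|_\infty}{\delta}\,\delta \le \frac{2\|f\|_\infty}{\delta}\,|x - x_0|$. Now,
for any integer $n > \max\{|f'(x_0)| + 1, 2\|f\|_\infty/\delta\}$, and all $x \in [0,1]$, $|f(x) - f(x_0)| \le
n|x - x_0|$.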


Remark 1. A direct consequence of example 7 is that if $f \in C[0,1]$ is not in any
$\mathcal{F}_n$, then $f$ is nowhere differentiable in $(0,1)$. To prove the existence of a single
nowhere differentiable function, we need to show that $C[0,1] - \bigcup_{n=1}^\infty \mathcal{F}_n \ne \emptyset$.
Theorem 4.6.9 provides the plan of attack if we wish to prove that there is an
abundance of continuous nowhere differentiable functions: Prove that each $\mathcal{F}_n$
is closed and nowhere dense. Then $U_n = C[0,1] - \mathcal{F}_n$ is open and dense in
$C[0,1]$, and hence $\bigcap_{n=1}^\infty U_n$ is also dense. The set $\bigcap_{n=1}^\infty U_n$ consists entirely of
nowhere differentiable functions.


Lemma 4.6.16. The set $\mathcal{F}_n$ is closed.


Proof. Let $(f_k)$ be a sequence in $\mathcal{F}_n$, and suppose $f_k \to f$ in the uniform norm. For each
$k \in \mathbb{N}$, let $x_k$ be such that $|f_k(x) - f_k(x_k)| \le n|x - x_k|$ for all $x \in [0,1]$. By the Bolzano-Weierstrass
theorem (theorem 1.2.8), $(x_k)$ contains a convergent subsequence $(x_{k_p})$. For simplicity
of notation, write $x_p$ for $x_{k_p}$ and $f_p$ for $f_{k_p}$, and let $x_0 = \lim_{p \to \infty} x_p$. For $x \in [0,1]$,
$|f(x) - f(x_0)| = \lim_p |f_p(x) - f_p(x_p)| \le n \lim_p |x - x_p| = n|x - x_0|$.


We need to construct continuous functions that change direction steeply and 
frequently. Continuous, piecewise linear functions of this type exist. A continuous, 
piecewise linear function f has one-sided derivatives at each x in [0,1]. We denote 
the right and left derivatives of $f$ by $D^+ f(x)$ and $D^- f(x)$, respectively. We use
the notation $|Df|$ to denote the minimum (absolute) value of the (one-sided)
derivatives of $f$. Simply put, $|Df|$ is the minimum absolute value of the slope of
any straight line segment of the graph of $f$. Figure 4.2 shows the graph of the type
of functions of interest to us. In that graph, the slope of any straight line segment
of the graph is $\pm 4$, and $|D\psi| = 4$.


Example 8. Given an arbitrary interval $[a,b]$ and an integer $k \ge 1$, there exists a
continuous, piecewise linear function $\psi$ on $[a,b]$ such that $\psi(a) = 0 = \psi(b)$,
$\|\psi\|_\infty = 2^{-k}$, and $|D\psi| > 2^k$.


Choose an integer $m$ such that $2 \cdot 4^m > 4^k (b - a)$, and divide the interval $[a,b]$
into $4^m$ subintervals of equal length. For $0 \le j \le 4^m$, let $x_j = a + \frac{j(b-a)}{4^m}$. Define $\psi$
to be the continuous, piecewise linear function such that, for $0 \le j \le 4^m$, $\psi(x_j) =
0$, and, for $0 \le j \le 4^m - 1$, $\psi\left(\frac{x_j + x_{j+1}}{2}\right) = 2^{-k}$. The absolute value of the slope of any
straight line segment of the graph of $\psi$ is equal to $\frac{2^{-k}}{(b-a)/(2 \cdot 4^m)} = \frac{2 \cdot 4^m}{2^k (b-a)} > 2^k$.


The idea behind the construction in example 8 is geometrically simple. If we want
the short function $\psi$ (its height is $2^{-k}$) to have very steep slopes, we must make the
base of each triangle very small, about $4^{-k}$. Then the slopes would have magnitudes
$2^{-k}/4^{-k} = 2^k$. Figure 4.2 depicts the function $\psi$ on $[0,1]$ with $k = 1 = m$.


Figure 4.2 The short, narrow, but spiky function $\psi$
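
The construction of example 8 is easy to reproduce. The sketch below builds the tent function with $4^m$ teeth of height $2^{-k}$ and reports the common slope magnitude; the function name `make_psi` and the closed-form tent formula are illustrative choices made here, and the sample call uses the parameters of figure 4.2 ($[a,b] = [0,1]$, $k = m = 1$).

```python
def make_psi(a, b, k, m):
    """Tent function from example 8: 4**m teeth of height 2**-k on [a, b]."""
    teeth = 4 ** m
    width = (b - a) / teeth          # length of one subinterval [x_j, x_{j+1}]
    height = 2.0 ** (-k)

    def psi(x):
        # position of x inside its subinterval, scaled to [0, 1)
        t = ((x - a) / width) % 1.0
        # zero at the nodes x_j, equal to height at each midpoint
        return height * (1.0 - abs(2.0 * t - 1.0))

    slope = height / (width / 2.0)   # rise over half the base of a tooth
    return psi, slope

psi, slope = make_psi(0.0, 1.0, k=1, m=1)
```

For these parameters the slope magnitude is 4, which exceeds $2^k = 2$ as the example requires.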

Remark 2. The function $\psi$ constructed in example 8 is not in $\mathcal{F}_n$ for any $n \le 2^k$.
For any point $(x_0, \psi(x_0))$ on the graph of $\psi$, there is another point $(x, \psi(x))$ on
the same straight line segment of the graph. Thus the slope of that line segment
is $\pm|D\psi|$, which is greater than $2^k$ in absolute value.


Example 9. Let $[a,b]$ be an arbitrary interval, and let $h$ be the linear function
$h(x) = mx + c$. For every $\varepsilon > 0$ and for every $n \ge 1$, there exists a continuous,
piecewise linear function $\varphi$ on $[a,b]$ such that $\varphi(a) = h(a)$, $\varphi(b) = h(b)$,
$\|h - \varphi\|_\infty < \varepsilon$, and $|D\varphi| > n$.


Choose an integer $k$ such that $2^{-k} < \varepsilon$ and $2^k - |m| > n$. Using example 8,
we find a continuous, piecewise linear function $\psi$ such that $\psi(a) = 0 = \psi(b)$,
$\|\psi\|_\infty = 2^{-k}$, $|D\psi| > 2^k$. Define $\varphi = h + \psi$. Clearly, $\|h - \varphi\|_\infty = \|\psi\|_\infty = 2^{-k} <
\varepsilon$ and, for $x \in [a,b]$, $|D^{\pm}\varphi(x)| = |D^{\pm}\psi(x) + m| \ge |D^{\pm}\psi(x)| - |m| \ge |D\psi| -
|m| > 2^k - |m| > n$.


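Lemma 4.6.17. For each $n \ge 1$, $\mathcal{F}_n$ is nowhere dense in $C[0,1]$.


Proof. Let $f \in C[0,1]$ and let $\varepsilon > 0$. We will show that there is a continuous, piecewise
linear function $\varphi$ such that $\|f - \varphi\|_\infty < \varepsilon$ and $\varphi \notin \mathcal{F}_n$. Since continuous, piecewise
linear functions are dense in $C[0,1]$ (see example 1 in section 4.5), let $h$ be a
continuous, piecewise linear function with nodes $0 = x_0 < x_1 < \dots < x_M = 1$ such that
$h(x_j) = f(x_j)$ for $0 \le j \le M$ and $\|f - h\|_\infty < \varepsilon/2$. For $0 \le j \le M - 1$, let $h_j$ be the
restriction of $h$ to $[x_j, x_{j+1}]$. By example 9, for each $j$ we construct a piecewise linear
function $\varphi_j$ such that $\|h_j - \varphi_j\|_\infty < \varepsilon/2$ and $|D\varphi_j| > n$. We define the required function
$\varphi$ by pasting together the functions $\varphi_j$. The function $\varphi$ is continuous because
$\varphi_j(x_{j+1}) = f(x_{j+1}) = \varphi_{j+1}(x_{j+1})$. Since $|D\varphi| > n$, the argument of remark 2 shows that
$\varphi \notin \mathcal{F}_n$. Now $\|f - \varphi\|_\infty \le \|f - h\|_\infty + \|h - \varphi\|_\infty < \varepsilon$.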


The following result follows from remark 1, lemma 4.6.16, lemma 4.6.17, and 
theorem 4.6.9. 


Theorem 4.6.18. Continuous, nowhere differentiable functions are dense in
$C[0,1]$.
Exercises 
1. Prove that $\mathbb{R}^n$ and $\mathbb{C}^n$ are complete.

2. Prove that $l^\infty$ is complete.

3. Prove that the space $c_0$ of null sequences is complete.

4. Define a metric on $\mathbb{N}$ as follows: $d(m,n) = \left| \frac{1}{n} - \frac{1}{m} \right|$. Prove that $d$ is an
incomplete metric.

5. For $x, y \in \mathbb{R}$, define $d(x,y) = |\tan^{-1}x - \tan^{-1}y|$. Prove that $d$ is an incomplete
metric on $\mathbb{R}$. Use the identity $\tan^{-1}x - \tan^{-1}y = \tan^{-1}\left(\frac{x - y}{1 + xy}\right)$.

6. Let $A$ be a dense subset of a metric space $X$ such that every Cauchy sequence
in $A$ is convergent to a point in $X$. Prove that $X$ is complete.

7. Prove that if $(x_n)$ and $(y_n)$ are Cauchy sequences in a metric space, then
$d(x_n, y_n)$ converges.

8. Prove the converse of the Cantor intersection theorem. Hint: Let $(x_n)$ be a
Cauchy sequence. For each $n \in \mathbb{N}$, let $A_n = \{x_n, x_{n+1}, \dots\}$, and let $F_n = \overline{A_n}$.
Show that $\lim_n \operatorname{diam}(F_n) = 0$.

9. Prove that a subset $A$ of a metric space is nowhere dense if and only if every
nonempty open set $U$ contains a nonempty open subset $V$ such that $V \cap
A = \emptyset$.

10. Show that a closed subset $F$ of a metric space $X$ is nowhere dense if and
only if $X - F$ is dense.

11. Show that the boundary of a closed subset $F$ of a metric space $X$ is nowhere
dense and give an example to show that the assumption that $F$ is closed
cannot be omitted.

12. Let $X$ be a complete metric space, and let $\{F_n\}$ be a countable collection of
closed, nowhere dense subsets of $X$. Is $\bigcup_{n=1}^\infty F_n$ necessarily nowhere dense?

13. Prove that a contraction on a metric space is continuous. Notice that this
fact was used in the proof of theorem 4.6.12.

14. Prove that the metric $d$ in the proof of theorem 4.6.13 is complete.

15. Prove that the function $T_y$ in the proof of theorem 4.6.13 is continuous.

16. Let $g : [a,b] \to \mathbb{R}$ and $K : [a,b] \times [a,b] \to \mathbb{R}$ be continuous functions.
Show that when $|\lambda|$ is small enough, the integral equation

$$y(x) = \lambda \int_a^b K(x,t)\,y(t)\,dt + g(x)$$

has a unique solution in $C[a,b]$.

17. Show that the fixed point of $T$ found in the proof of theorem 4.6.14 is
unique.

18. Show that the function $T_u$ in the proof of theorem 4.6.15 is continuous.

19. Definition. An $n \times n$ matrix $A = (a_{ij})$ is said to be diagonally dominant
if, for each $1 \le i \le n$, $|a_{ii}| > \sum_{j \ne i} |a_{ij}|$.
Prove that a diagonally dominant matrix is invertible. Hint: If $0 \ne x \in \mathbb{R}^n$
is such that $Ax = 0$, let $i$ be such that $|x_i| = \max_{1 \le j \le n} |x_j|$. Now write down
the $i$th equation of the system $Ax = 0$.

Numerical approximations of linear elliptic partial differential equations
often lead to matrix equations with a very large, sparse, diagonally
dominant matrix. Iterative solutions are practical in this situation, and the
method described in the problem below, the Jacobi iteration, is one of the
simplest (and the slowest).

20. Let $A$ be a diagonally dominant matrix, and consider the system $Ax = b$.
Write $A$ as follows: $A = D + L + U$, where

$$U = \begin{pmatrix} 0 & a_{12} & \cdots & a_{1n} \\ & 0 & \ddots & \vdots \\ & & \ddots & a_{n-1,n} \\ & & & 0 \end{pmatrix}, \quad
L = \begin{pmatrix} 0 & & & \\ a_{21} & 0 & & \\ \vdots & \ddots & \ddots & \\ a_{n1} & \cdots & a_{n,n-1} & 0 \end{pmatrix}, \quad
D = \begin{pmatrix} a_{11} & & \\ & \ddots & \\ & & a_{nn} \end{pmatrix}.$$

Define $J = -D^{-1}(L + U)$. Show that the function $T : \mathbb{R}^n \to \mathbb{R}^n$ defined
by $Tx = Jx + D^{-1}b$ is a contraction. Conclude that the iteration $x_k =
Tx_{k-1}$, $k \ge 1$, converges to the solution of the system $Ax = b$. Hint: Examine
the matrix norm $\|J\|_\infty$ defined in section 3.6.
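
The Jacobi iteration of problem 20 can be tried on a small system. The following is a minimal sketch, assuming that $\|J\|_\infty$ is the maximum absolute row sum and using a hypothetical $3 \times 3$ diagonally dominant system chosen here so that the exact solution is $(1, 2, 3)$; the function names are illustrative.

```python
def jacobi(A, b, iterations=100):
    """Iterate x_k = J x_{k-1} + D^{-1} b, where J = -D^{-1}(L + U)."""
    n = len(A)
    x = [0.0] * n
    for _ in range(iterations):
        # componentwise form: x_i <- (b_i - sum_{j != i} a_ij x_j) / a_ii
        x = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x

def j_norm_inf(A):
    """max_i sum_{j != i} |a_ij| / |a_ii|, the maximum absolute row sum of J."""
    n = len(A)
    return max(
        sum(abs(A[i][j]) for j in range(n) if j != i) / abs(A[i][i])
        for i in range(n)
    )

# Hypothetical diagonally dominant system with exact solution (1, 2, 3).
A = [[4.0, 1.0, 1.0], [1.0, 5.0, 2.0], [0.0, 1.0, 3.0]]
b = [9.0, 17.0, 11.0]
x = jacobi(A, b)
```

Here the row-sum norm of $J$ is $0.6$, so each sweep reduces the error in the max norm by at least that factor.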


4.7 Compactness 


A clear manifestation of sequential compactness can be seen in examples 7 and 8 in 
section 1.2, where we proved the boundedness of continuous functions and their 
uniform continuity on a compact interval. We urge the reader to re-examine these 
two examples. This section opens with the topological (non-sequential) definition 
of compactness and the establishment of the general characteristics of compact 
spaces. This is done in order to avoid the duplication of definitions and results in 
the corresponding section in chapter 5. The various equivalent characterizations 
of compact metric spaces are discussed, and then we prove two famous theorems: 
Tychonoff’s theorem and the Heine-Borel theorem. The section concludes with an
illuminating application on closed convex subsets of $\mathbb{R}^n$.


Definition. A metric space X is said to be compact if every open cover of X 
contains a finite subcover of X. The definitions of open covers and subcovers 
have been stated in section 4.5. 


Example 1. The collection $\mathcal{U} = \{(-n, n) : n \in \mathbb{N}\}$ of open subsets of $\mathbb{R}$ is an open
cover of $\mathbb{R}$ that contains no finite subcover. Therefore $\mathbb{R}$ is not compact.

Example 2. The sequence of open intervals $\mathcal{U} = \{I_n = (1/n, 1 - 1/n) : n \ge 3\}$ covers
the open interval $(0,1)$, but no finite subset of $\mathcal{U}$ covers $(0,1)$. Thus $(0,1)$ is
not compact.


Definition. Let $K$ be a subset of a metric space $X$. We say that $K$ is a compact subset
(or a compact subspace) of $X$ if it is compact in the restricted metric.


Theorem 4.7.1. A subset $K$ of a metric space $X$ is compact if and only if it satisfies
the following condition: if $\mathcal{U}$ is a collection of open subsets of $X$ such that
$K \subseteq \bigcup\{U : U \in \mathcal{U}\}$, then there exists a finite subcollection $\{U_1, U_2, \dots, U_n\}$ of $\mathcal{U}$
such that $K \subseteq \bigcup_{i=1}^n U_i$.


Proof. Suppose $K$ is compact and that $K \subseteq \bigcup\{U : U \in \mathcal{U}\}$. Then $K = \bigcup\{K \cap U :
U \in \mathcal{U}\}$, and each of the sets $K \cap U$ is open in $K$. Therefore, for a finite
subcollection $\{U_1, U_2, \dots, U_n\}$ of $\mathcal{U}$, $K = \bigcup\{K \cap U_i : 1 \le i \le n\}$. Thus $K \subseteq \bigcup_{i=1}^n U_i$.
The proof of the converse is left as an exercise.


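Example 3. The set $K = \{1/n : n \in \mathbb{N}\} \cup \{0\}$ is compact. Suppose $\{U_\alpha : \alpha \in I\}$ is
an open cover of $K$, and let $0 \in U_{\alpha_0}$. Since $\lim_n 1/n = 0$, there is an integer $N$
such that $1/n \in U_{\alpha_0}$ for all $n > N$. For $i = 1, \dots, N$, choose members $U_{\alpha_1}, \dots, U_{\alpha_N}$
of the cover such that $1/i \in U_{\alpha_i}$. Clearly, $K \subseteq \bigcup_{i=0}^N U_{\alpha_i}$.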


Theorem 4.7.2. A closed subspace K of a compact space X is compact. 


Proof. Let $\mathcal{U}$ be a collection of open subsets of $X$ whose union contains $K$.
Then $\mathcal{U}^* = \mathcal{U} \cup \{X - K\}$ is an open cover of $X$. Therefore there exists a finite
subcollection $\mathcal{U}'$ of $\mathcal{U}^*$ that covers $X$. There is no loss of generality in assuming
that $X - K \in \mathcal{U}'$. Thus $X = (X - K) \cup \bigcup_{i=1}^n U_i$, where each $U_i \in \mathcal{U}$. Since $K \subseteq X$,
and $K$ does not intersect $X - K$, $K \subseteq \bigcup_{i=1}^n U_i$. This proves that $K$ is compact.


Theorem 4.7.3. A compact subspace K of a metric space X is closed and bounded. 


Proof. We prove that $X - K$ is open. Let $x \in X - K$. For every point $y \in K$, there
exist disjoint open subsets $U_y$ and $V_y$ of $X$ such that $x \in U_y$ and $y \in V_y$. Now $K \subseteq
\bigcup_{y \in K} V_y$, so, by the compactness of $K$, there are finitely many points $y_1, y_2, \dots, y_n \in
K$ such that $K \subseteq \bigcup_{i=1}^n V_{y_i}$. Now let $U = \bigcap_{i=1}^n U_{y_i}$. Clearly, $K \cap U = \emptyset$, thus $x \in U \subseteq
X - K$, and hence $X - K$ is open. We leave the proof that $K$ is bounded as an
exercise.


Theorem 4.7.4. The continuous image of a compact space is compact. 

Proof. Let $(X, d)$ be a compact space, and let $(Y, e)$ be a metric space. We show that
if $f : X \to Y$ is a continuous surjection, then $(Y, e)$ is compact. Let $\{V_\alpha\}$ be an
open cover of $Y$. Since $f$ is continuous, $f^{-1}(V_\alpha)$ is open in $X$ for each $\alpha$, and hence
$\{f^{-1}(V_\alpha)\}$ is an open cover of $X$. The compactness of $X$ yields a finite subcover
$\{f^{-1}(V_{\alpha_i})\}_{i=1}^n$ of $X$. Clearly, $\{V_{\alpha_i}\}_{i=1}^n$ covers $Y$.


Definition. A metric space X is sequentially compact if every sequence in X 
contains a subsequence that converges in X. 


Definition. A metric space X has the Bolzano-Weierstrass property if every 
infinite subset of X has a limit point. 


Theorem 4.7.5. A metric space $X$ is sequentially compact if and only if it has the
Bolzano-Weierstrass property.


Proof. Let $A$ be an infinite subset of a sequentially compact space $X$. Then $A$ contains
a sequence $(x_n)$ of distinct points. By assumption, $(x_n)$ contains a subsequence
that converges to a point, $x$. By theorem 4.2.6, $x$ is a limit point of $A$.


Conversely, suppose $X$ has the Bolzano-Weierstrass property, and let $(x_n)$ be a
sequence in $X$. If the range $A = \{x_1, x_2, \dots\}$ of $(x_n)$ is finite, then $(x_n)$ contains
a constant subsequence, which is clearly convergent. So, suppose $A$ is infinite. By
assumption, $A$ has a limit point, $x$. Let $n_1 \in \mathbb{N}$ be such that $d(x_{n_1}, x) < 1$. Having
found positive integers $n_1 < n_2 < \dots < n_k$ such that $d(x_{n_i}, x) < 1/i$, for $1 \le i \le k$,
we pick an integer $n_{k+1} > n_k$ such that $d(x_{n_{k+1}}, x) < 1/(k+1)$. Such an integer
exists because otherwise, for every $n > n_k$, we would have $d(x, x_n) \ge \frac{1}{k+1}$, which
is impossible since $x$ is a limit point of $A$. By construction, $\lim_k x_{n_k} = x$.


Definition. A metric space $X$ is totally bounded if, for every $\varepsilon > 0$, there exists
a finite subset $\{x_1, \dots, x_n\}$ of $X$ such that $X = \bigcup_{i=1}^n B(x_i, \varepsilon)$. The set $\{x_1, \dots, x_n\}$ is
called an $\varepsilon$-dense subset of $X$.
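
For a finite sample of points, an $\varepsilon$-dense subset can be built greedily: keep a point as a new center only if it is at distance at least $\varepsilon$ from every center chosen so far. The sketch below is an illustration only; the finite grid in the unit square stands in for the space $X$, and the function name `greedy_eps_net` is hypothetical.

```python
import math

def greedy_eps_net(points, eps):
    """Pick centers greedily so that every point lies within eps of some center."""
    centers = []
    for p in points:
        # p becomes a center only if no existing center is closer than eps
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

# Illustrative finite sample of the unit square (a stand-in for the space X).
pts = [(i / 20, j / 20) for i in range(21) for j in range(21)]
net = greedy_eps_net(pts, 0.25)
```

The chosen centers are pairwise at least $\varepsilon$ apart, which is the same separation property used in the proof of theorem 4.7.6 to show that a sequentially compact space is totally bounded.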


Theorem 4.7.6. A metric space X is sequentially compact if and only if it is complete 
and totally bounded. 


Proof. Suppose $X$ is sequentially compact. Let $(x_n)$ be a Cauchy sequence in $X$. By
assumption, $(x_n)$ contains a subsequence $(x_{n_k})$ that converges to a point $x$. By
theorem 4.6.3, $\lim_n x_n = x$, and $X$ is complete. If $X$ is not totally bounded, then
there exists $\varepsilon > 0$ such that if $F$ is a finite subset of $X$, then $X \ne \bigcup_{x \in F} B(x, \varepsilon)$. Pick a
point $x_1 \in X$, and a point $x_2 \notin B(x_1, \varepsilon)$. Since $B(x_1, \varepsilon) \cup B(x_2, \varepsilon) \ne X$, there exists a
point $x_3 \in X$ such that $d(x_3, x_i) \ge \varepsilon$, $i = 1, 2$. Continuing this construction yields a
sequence $(x_1, x_2, \dots)$ of $X$ such that $d(x_n, x_m) \ge \varepsilon$ for all $n, m \in \mathbb{N}$, $n \ne m$. Clearly,
$(x_n)$ contains no convergent subsequence, which contradicts the sequential
compactness of $X$.


Suppose $X$ is complete and totally bounded. We claim that $X$ has the Bolzano-Weierstrass
property. The proof will be complete by theorem 4.7.5. Let $A$ be an
infinite subset of $X$. The total boundedness of $X$ allows us to cover $X$ by a finite
collection of closed balls of radius 1. One of the balls, $B_1$, contains infinitely many
points of $A$. Define $F_1 = B_1$, and $A_1 = A \cap B_1$. Now cover $X$ by a finite collection of
closed balls of radius 1/2. One of those balls, $B_2$, contains infinitely many points of
$A_1$, and hence infinitely many points of $A$. Define $F_2 = B_2 \cap F_1$, and $A_2 = A_1 \cap B_2$.
Continue by induction to construct a sequence of closed subsets $F_1 \supseteq F_2 \supseteq F_3 \supseteq \dots$
such that $\operatorname{diam}(F_n) \le 2/n$ and each $F_n$ contains infinitely many points of $A$. By the
Cantor intersection theorem, let $\{x\} = \bigcap_{n=1}^\infty F_n$. Since $\lim_n \operatorname{diam}(F_n) = 0$, any ball
centered at $x$ contains $F_n$ for sufficiently large $n$. Since $F_n$ contains infinitely many
points of $A$, $x$ is a limit point of $A$.


Definition. Let $\mathcal{U} = \{U_\alpha\}$ be an open cover of a metric space $X$. A Lebesgue
number for $\mathcal{U}$ is a positive number $a$ such that every subset $A$ of $X$ of diameter
less than $a$ is contained in one member of $\mathcal{U}$.


Theorem 4.7.7. In a sequentially compact metric space X, every open cover of X has 
a Lebesgue number. 


Proof. Suppose that there is an open cover $\mathcal{U} = \{U_\alpha\}$ of $X$ that does not have a
Lebesgue number. We show that $X$ is not sequentially compact. By assumption,
for every $n \in \mathbb{N}$, there exists a subset $A_n$ of $X$ such that $\operatorname{diam}(A_n) < 1/n$, and
$A_n$ is not contained in any member of $\mathcal{U}$. For each $n \in \mathbb{N}$, pick a point $x_n \in A_n$.
We claim that $(x_n)$ has no convergent subsequence. Suppose, contrary to our
claim, that some subsequence $(x_{n_k})$ of $(x_n)$ converges to $x$. Since $\bigcup_\alpha U_\alpha = X$,
there exists a member $U_0$ of $\mathcal{U}$ that contains $x$, and since $U_0$ is open, there
is a number $\delta > 0$ such that $B(x, \delta) \subseteq U_0$. Now choose a positive integer $K$
such that $d(x_{n_K}, x) < \delta/2$, and $1/n_K < \delta/2$. If $y \in A_{n_K}$, then $d(x, y) \le d(x, x_{n_K}) +
d(x_{n_K}, y) < \delta/2 + \operatorname{diam}(A_{n_K}) < \delta/2 + \delta/2 = \delta$. This implies that $A_{n_K} \subseteq B(x, \delta) \subseteq
U_0$, which is a contradiction.


Theorem 4.7.8. Every sequentially compact metric space is compact.

Proof. Let $X$ be sequentially compact, and let $\mathcal{U} = \{U_\alpha\}$ be an open cover of $X$.
By theorem 4.7.7, $\mathcal{U}$ has a Lebesgue number, $a$. Let $\varepsilon = a/3$. By theorem 4.7.6,
there exists a finite subset $\{x_1, \dots, x_n\}$ of $X$ such that $\bigcup_{i=1}^n B(x_i, \varepsilon) = X$. For each
$1 \le i \le n$, $\operatorname{diam}(B(x_i, \varepsilon)) \le 2\varepsilon < a$. Therefore each ball $B(x_i, \varepsilon)$ is contained in a
member $U_{\alpha_i}$ of $\mathcal{U}$. Clearly, $X = \bigcup_{i=1}^n U_{\alpha_i}$.


Theorem 4.7.9. For a metric space X, the following are equivalent: 


(a) X is compact. 

(b) X is sequentially compact. 

(c) X has the Bolzano-Weierstrass property.
(d) X is complete and totally bounded. 


Proof. In light of theorems 4.7.5, 4.7.6, and 4.7.8, we only need to show that (a)
implies (b). Let $(x_n)$ be a sequence in $X$. Define $A_n = \{x_n, x_{n+1}, \dots\}$ and let $F_n =
\overline{A_n}$. Clearly, $\{F_n\}$ is a descending sequence of closed nonempty sets. If $\bigcap_{n \in \mathbb{N}} F_n = \emptyset$,
then $\bigcup_{n \in \mathbb{N}} (X - F_n) = X$. Thus $(X - F_n)$ is an ascending sequence of open subsets
that covers $X$. By the compactness of $X$, $X = X - F_n$ for some positive integer $n$, and hence $F_n =
\emptyset$. This contradiction shows that $\bigcap_{n \in \mathbb{N}} F_n \ne \emptyset$. Let $x \in \bigcap_{n \in \mathbb{N}} F_n$. Observe that $x$
is a closure point of each of the sets $A_n$. Since $x \in \overline{A_1}$, there exists an integer
$n_1 \ge 1$ such that $d(x_{n_1}, x) < 1$. Now $x \in \overline{A_{n_1+1}}$; thus there is an integer $n_2 \ge n_1 + 1$
such that $d(x_{n_2}, x) < 1/2$. Having found a sequence of positive integers $n_1 < n_2 <
\dots < n_k$ such that, for $1 \le i \le k$, $d(x_{n_i}, x) < 1/i$, choose an integer $n_{k+1} \ge n_k + 1$
such that $d(x_{n_{k+1}}, x) < \frac{1}{k+1}$. This is possible because $x \in \overline{A_{n_k+1}}$. By construction,
$\lim_k x_{n_k} = x$.


Theorem 4.7.10 (Tychonoff’s theorem). The product of finitely many compact 
metric spaces is compact. 


Proof. It is enough to show that the product of two compact metric spaces $X$ and
$Y$ is compact. Let $(x_n, y_n)$ be a sequence in $X \times Y$. Since $X$ is compact, there is a
subsequence $(x_{n_k})$ of $(x_n)$ that converges to $x \in X$. Since $Y$ is compact, there exists
a subsequence $(y_{n_{k_p}})$ of $(y_{n_k})$ that converges to a point $y \in Y$. Now $(x_{n_{k_p}}, y_{n_{k_p}})$ is a
subsequence of $(x_n, y_n)$ that converges to $(x, y)$ as $p \to \infty$.


Example 4. The convex hull, $C$, of a compact subset $K \subseteq \mathbb{R}^n$ is compact.


Let $T_n$ be the standard $n$-simplex, which is compact by problem 21 at the end of
this section. Consider the function $F : T_n \times K^{n+1} \to C$ defined by

$$F(\lambda_0, \dots, \lambda_n, x_0, \dots, x_n) = \sum_{i=0}^n \lambda_i x_i, \quad \text{where } (\lambda_0, \dots, \lambda_n) \in T_n \text{ and } x_0, \dots, x_n \in K.$$

The continuity of $F$ is straightforward, and $F$ is surjective since, by Carathéodory’s
theorem, every point in $C$ is a convex combination of at most $n + 1$ points in $K$.
By Tychonoff’s theorem, the set $T_n \times K^{n+1}$ is compact. The result now follows
from theorem 4.7.4.


Theorem 4.7.11 (the Heine-Borel theorem). A subset $K$ of $\mathbb{R}^n$ is compact if and
only if it is closed and bounded.


Proof. A compact subset of any metric space is closed and bounded by theorem
4.7.3. Conversely, suppose $K \subseteq \mathbb{R}^n$ is closed and bounded; $K$ is contained in some
rectangle $I_1 \times \dots \times I_n$, where each $I_i$ is a closed bounded interval in $\mathbb{R}$. By theorem
1.2.10, each $I_i$ has the Bolzano-Weierstrass property and hence is compact by
theorem 4.7.9. By Tychonoff’s theorem, $I_1 \times \dots \times I_n$ is compact and, by theorem
4.7.2, $K$ is compact.


Remark 1. In the above proof of the Heine-Borel theorem, it is tacitly assumed
that the metric involved is the product metric as defined in section 4.4. This is
largely a matter of convenience. We may show that $K$ is closed and bounded in the
1-norm or the Euclidean norm.⁶ See example 6 following theorem 4.3.8.


Theorem 4.7.12. A continuous real-valued function f on a compact space X is 
bounded and attains its maximum and minimum values. 


Proof. By theorem 4.7.4, $f(X)$ is a compact subset of $\mathbb{R}$. By the Heine-Borel theorem,
$f(X)$ is closed and bounded. Therefore $f$ is a bounded function. Since $f(X)$ is closed,
it contains its least upper and greatest lower bounds. Therefore $\max_{x \in X} f(x) =
\sup_{x \in X} f(x)$ is in $f(X)$, and hence the maximum value of $f$ is attained. The same
reasoning shows that the minimum value of $f$ is attained in $X$.


Definition. A metric space $X$ is locally compact if every point in $X$ belongs to the
interior of a compact subset of $X$. Thus, for every $x \in X$, there is an open subset
$V$ of $X$ such that $x \in V$ and $\overline{V}$ is compact.

Theorem 4.7.13. $\mathbb{R}^n$ is locally compact.

Proof. Any point $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ is contained in the open rectangle

$$V = (x_1 - 1, x_1 + 1) \times \dots \times (x_n - 1, x_n + 1),$$

and

$$\overline{V} = [x_1 - 1, x_1 + 1] \times \dots \times [x_n - 1, x_n + 1]$$

is compact.


⁶ We will see in section 6.1 that all norms on $\mathbb{R}^n$ are equivalent. Thus if a set $K$ is closed and bounded
in one norm on $\mathbb{R}^n$, then it is closed and bounded in any norm on $\mathbb{R}^n$.

Example 5. $\mathbb{Q}$ is not locally compact.


Since every open subset of $\mathbb{Q}$ is the union of sets of the form $(a,b) \cap \mathbb{Q}$,
where $a, b \in \mathbb{R}$, it is enough to show that a set of the form $I = [a,b] \cap \mathbb{Q}$ is
not sequentially compact. Choose an irrational number $r \in (a,b)$, then choose
a sequence $x_n \in I$ such that $\lim_n x_n = r$. Clearly, no subsequence of $(x_n)$ is
convergent in $I$.


Example 6. The metric space $l^\infty$ is not locally compact. It is enough to show that
the closed unit ball $B = \{x \in l^\infty : \|x\|_\infty \le 1\}$ is not compact (see problem 8 in
the section exercises). The ball $B$ contains the canonical vectors $e_n$ of $\mathbb{K}(\mathbb{N})$, and since
$d(e_n, e_m) = 1$ if $n \ne m$, the sequence $(e_n)$ in $B$ does not contain a convergent
subsequence.


The proof of the following theorem is left as an exercise. 


Theorem 4.7.14. The product of finitely many locally compact spaces is locally 
compact. 


Excursion: Closed Convex Subsets of $\mathbb{R}^n$


Example 7. Let K be a compact subset of $\mathbb{R}^n$, and let $a \in \mathbb{R}^n - K$. Then there exists
a point $z \in K$ such that $\|z - a\|_2 = \operatorname{dist}(a, K)$. The point z is a closest point in
K to a.


Define a function $f : K \to \mathbb{R}$ by $f(x) = \|x - a\|_2$. Clearly, f is continuous. By
theorem 4.7.12, f is bounded and attains its minimum value in K. Thus there
is a point $z \in K$ such that $f(z) = \|z - a\|_2 = \min\{f(x) : x \in K\} = \operatorname{dist}(a, K)$.


Example 8. Let C be a closed subset of $\mathbb{R}^n$, and let $a \in \mathbb{R}^n - C$. Then there exists
a point $z \in C$ such that $\|z - a\|_2 = \operatorname{dist}(a, C)$. If, in addition, C is convex, then z
is unique.


Let B be a closed ball of radius r centered at a, and assume r is large enough so
that $B \cap C \ne \emptyset$. The set $K = B \cap C$ is a closed and bounded subset of $\mathbb{R}^n$. By the
Heine-Borel theorem, K is compact. By the previous example, there is a point
$z \in K$ such that $d = \|z - a\|_2 = \operatorname{dist}(a, K)$. Since $d \le r$ and $\|x - a\|_2 > r$ for every
vector $x \in C - B$, $\|a - z\|_2 = \operatorname{dist}(a, C)$. We leave it as an exercise to show that z
is unique when C is convex.


Example 9 (the obtuse angle criterion). Let C be a closed convex subset of $\mathbb{R}^n$,
let $a \in \mathbb{R}^n - C$, and let z be the closest element of C to a. Then, for every $y \in C$,
$\langle a - z, y - z \rangle \le 0$. Here $\langle \cdot, \cdot \rangle$ is the Euclidean inner product on $\mathbb{R}^n$.


Without loss of generality, assume that $y \ne z$. Consider the quadratic function
$$\varphi(t) = \|(1 - t)z + ty - a\|_2^2, \quad 0 \le t \le 1.$$


Observe that
$$\varphi(t) = \|(z - a) + t(y - z)\|_2^2 = \|z - a\|_2^2 + 2t\langle z - a, y - z \rangle + t^2\|y - z\|_2^2.$$


Because C is convex, $(1 - t)z + ty \in C$ for every $0 \le t \le 1$. Since z is the closest
point in C to a, $\varphi(0) \le \varphi(t)$ for every $0 \le t \le 1$, and $\varphi$ is increasing on
[0, 1]. This can happen only if $\varphi'(0) \ge 0$. Thus $2\langle z - a, y - z \rangle \ge 0$, and hence
$\langle a - z, y - z \rangle \le 0$. ■


Observe that if $\theta$ is the angle between $a - z$ and $y - z$, then
$$\cos\theta = \frac{\langle a - z, y - z \rangle}{\|a - z\|_2 \, \|y - z\|_2}.$$
The condition $\langle a - z, y - z \rangle \le 0$ is equivalent to saying that $\theta$ is at least 90°, hence
the name obtuse angle criterion. Figure 4.3 illustrates the geometry. The wedge-shaped
region depicts the convex set C, and the rest of the diagram is self-explanatory.


Figure 4.3 The obtuse angle criterion 
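
The obtuse angle criterion is easy to test numerically. The following Python sketch is only an illustration under an assumption made here: C is taken to be the closed unit square [0, 1] x [0, 1], because for a box the closest point in the Euclidean norm is obtained by clamping each coordinate. The names `closest_point_in_box`, `inner`, and `obtuse_angle_holds` are hypothetical helpers, not notation from the text.

```python
import random

def closest_point_in_box(a, lo=0.0, hi=1.0):
    """Closest point to a in the box [lo, hi]^n for the Euclidean norm.

    The squared distance splits coordinate by coordinate, so clamping
    each coordinate to [lo, hi] minimizes it.
    """
    return [min(max(t, lo), hi) for t in a]

def inner(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(s * t for s, t in zip(u, v))

def obtuse_angle_holds(a, z, samples, tol=1e-12):
    """Check <a - z, y - z> <= 0 for every sample point y."""
    a_minus_z = [s - t for s, t in zip(a, z)]
    return all(
        inner(a_minus_z, [s - t for s, t in zip(y, z)]) <= tol
        for y in samples
    )

random.seed(0)
a = [2.0, 0.5]                      # a point outside the unit square
z = closest_point_in_box(a)         # its closest point in the square
samples = [[random.random(), random.random()] for _ in range(1000)]
ok = obtuse_angle_holds(a, z, samples)
print(z, ok)
```

A finite sample can only fail to find a counterexample; it does not replace the proof of example 9.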
We know (see theorem 4.2.12) that a closed subset in a metric space can be
separated from a point outside it by disjoint open sets. In $\mathbb{R}^n$, a closed convex
subset can be separated from a point outside it in a much stronger and
more specific way. They can be separated by a hyperplane, as the next example
illustrates.


Example 10. Let C be a closed convex subset of $\mathbb{R}^n$, and let $a \in \mathbb{R}^n - C$. Then there
exists a hyperplane $n^T x = b$ such that $n^T y < b$ for every $y \in C$, and $n^T a > b$. Thus
C is contained in one of the open half-spaces determined by the hyperplane,
and a is contained in the other open half-space.


Let $z \in C$ be the closest point in C to a, and let $m = (a + z)/2$. We show that
the hyperplane we seek is the hyperplane that contains m and is normal to the
vector $n = a - z$. Without loss of generality, assume that $m = 0$ or, equivalently,
$z = -a$ and $n = 2a$.⁷

By the previous example, for every $y \in C$, $\langle y - z, a - z \rangle \le 0$, and hence
$\langle y - z, n \rangle \le 0$. Thus $\langle y, n \rangle \le \langle z, n \rangle = \langle -a, 2a \rangle = -2\|a\|_2^2 < 0$, and $n^T y < 0$. On
the other hand, $n^T a = \langle a, n \rangle = \langle a, 2a \rangle = 2\|a\|_2^2 > 0$. ■
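
The construction in example 10 can be sketched in Python without translating m to the origin: the hyperplane is then $n^T x = b$ with $n = a - z$ and $b = n^T m$. As an assumption made only for this illustration, C is the closed unit disk in the plane, for which the closest point to an outside point a is a/||a||. The helper names `dot` and `separating_hyperplane` are hypothetical.

```python
import math
import random

def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(s * t for s, t in zip(u, v))

def separating_hyperplane(a, z):
    """Return (n, b) for the hyperplane n.x = b through the midpoint
    m = (a + z)/2 with normal n = a - z, as in example 10."""
    n = [s - t for s, t in zip(a, z)]
    m = [(s + t) / 2 for s, t in zip(a, z)]
    return n, dot(n, m)

a = [3.0, 4.0]                       # a point outside the unit disk
norm_a = math.sqrt(dot(a, a))
z = [t / norm_a for t in a]          # closest point of the unit disk to a
n, b = separating_hyperplane(a, z)

# Sample points of the closed unit disk by rejection sampling.
random.seed(1)
samples = []
while len(samples) < 1000:
    y = [random.uniform(-1, 1), random.uniform(-1, 1)]
    if dot(y, y) <= 1.0:
        samples.append(y)

disk_side = all(dot(n, y) < b for y in samples)   # n.y < b on C
point_side = dot(n, a) > b                        # n.a > b
print(n, b, disk_side, point_side)
```

Here every point y of the disk satisfies $n^T y \le \|n\|_2 = 4$, while b is 12 up to rounding, so the strict inequalities hold with room to spare.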


Remark 2. A direct consequence of the above example is the following. Under
the assumptions of example 10, there exists a unit vector u and a real number b
such that, for all $y \in C$, $u^T y < b < u^T a$.


Theorem 4.7.15. A closed convex subset C of $\mathbb{R}^n$ is the intersection of the closed
half-spaces containing C.


Proof. We only need to show that C contains the intersection of the closed half-spaces
containing C. The reverse containment is obvious. If $a \notin C$, then, by the previous
example, there is a hyperplane $n^T x = b$ such that $n^T y < b$ for all $y \in C$ and
$n^T a > b$. Thus C is contained in the closed half-space $H = \{x \in \mathbb{R}^n : n^T x \le b\}$,
but $a \notin H$. ■


Definition. A hyperplane M is said to be a supporting hyperplane of the closed
convex set $C \subseteq \mathbb{R}^n$ if C is contained in one of the closed half-spaces determined
by M, and $C \cap M \ne \emptyset$. The closed half-space determined by M that contains
C is called a supporting half-space of C. Observe that every point in $C \cap M$ is
necessarily a boundary point of C.


7 We can translate C by $-m$. Specifically, we look at the set $C' = \{x - m : x \in C\}$ and the point
$m' = 0$. This translation preserves all the properties of C but has the advantage that the hyperplane we
seek has a homogeneous equation. This simplifies the algebra.
Example 11. Every tangent line to the unit circle is a supporting line of the closed
unit disk. The line y=x+1 is a supporting line of the closed unit square 
S=[0,1] x [0,1]. Slight rotations of the line about the point (0,1) are also 
supporting lines of S. Thus there are infinitely many supporting lines of S at 
the point (0, 1). The line x = 1 is also a supporting line of the square. 

Example 12. In the notation of example 10, the hyperplane $n^T x = n^T z$ is a
supporting hyperplane of C.


We conclude this section with the following fine application of compactness. 


Theorem 4.7.16 (the supporting hyperplane theorem). Suppose z is a boundary
point of a closed convex set $C \subseteq \mathbb{R}^n$. Then there exists a supporting hyperplane M
of C that contains z.


Proof. Let $(a_n)$ be a sequence in $\mathbb{R}^n - C$ such that $\lim_n a_n = z$. By remark 2, there is a
sequence $(u_n)$ of unit vectors such that,


for every $y \in C$, $u_n^T y < u_n^T a_n$.


By the compactness of the unit sphere in $\mathbb{R}^n$, $(u_n)$ contains a convergent subsequence,
which we continue to call $(u_n)$ for simplicity. Let $u = \lim_n u_n$. Taking the
limit of the two sides of the above inequality, we have $u^T y \le u^T z$ for all $y \in C$. The
hyperplane M orthogonal to u and containing z is the one we seek. ■


Exercises 


1. Prove directly that a compact metric space X is bounded.

2. Let $(x_n)$ be a convergent sequence in a metric space X, and let $x = \lim_n x_n$.
Prove that the set $\{x_n\}_{n=1}^\infty \cup \{x\}$ is compact.

3. Let $f : [0, 1] \to \mathbb{R}$ be continuous. Prove that the graph of f, $\{(x, f(x)) : x \in [0, 1]\}$,
is compact in $\mathbb{R}^2$.

4. Prove that if X is a compact metric space and $F_1 \supseteq F_2 \supseteq \dots$ is a descending
sequence of nonempty closed subsets of X, then $\bigcap_{n=1}^\infty F_n \ne \emptyset$.

5. Consider the space $c_0$ of null sequences, endowed with the supremum
norm. Prove that a bounded subset A of $c_0$ is totally bounded if and only if,
for every $\epsilon > 0$, there exists a natural number N such that $|x_n| < \epsilon$ for every
$n > N$ and every $x \in A$.

6. Prove that if a subset A of a metric space is totally bounded, then $\overline{A}$ is totally
bounded.

7. Prove that a totally bounded metric space is separable.


8. Prove that, in a normed linear space X, the closed unit ball
$$B = \{x \in X : \|x\| \le 1\}$$
is compact if and only if any closed ball in X is compact.

9. Prove that a normed linear space X is locally compact if and only if the
closed unit ball is compact. In this case, show that the unit sphere $\{x \in X : \|x\| = 1\}$
is compact.

10. Prove that the product of finitely many locally compact spaces is locally
compact.

11. In connection with example 8, prove the point z is unique when C is convex.

12. Let F be a compact subset of a metric space X, and let $a \in X - F$. Prove that
there exists a point $z \in F$ such that $d(z, a) = \operatorname{dist}(a, F)$. Also give an example
to show that z is not necessarily unique.

13. Let F be a closed subset of a locally compact normed linear space X, and let
$a \in X - F$. Prove that there exists a point $z \in F$ such that $d(z, a) = \operatorname{dist}(a, F)$.

14. Let K be a compact subset of a metric space X. Prove that there exist points
$x, y \in K$ such that $d(x, y) = \operatorname{diam}(K)$.

15. Show that if E is a compact subset of a metric space X and F is closed in X
and disjoint from E, then $\operatorname{dist}(E, F) > 0$.

16. Let A be a subset of a metric space (X, d). For $\epsilon > 0$, define
$$A_\epsilon = \bigcup_{x \in A} B(x, \epsilon).$$
Prove that
$$A_\epsilon = \{x \in X : \operatorname{dist}(x, A) < \epsilon\}.$$
Also show that if E is a compact subset of X and F is closed in X and disjoint
from E, then $E_\epsilon \cap F_\epsilon = \emptyset$ for some $\epsilon > 0$.

17. Show that if E and F are disjoint compact subsets of $\mathbb{R}^n$, then there are points
$x \in E$ and $y \in F$ such that $d(x, y) = \operatorname{dist}(E, F)$.

18. Let E and F be disjoint compact convex subsets of $\mathbb{R}^n$. Show that there exists
a hyperplane $u^T x = b$ such that $u^T x > b$ for every $x \in E$, and $u^T x < b$ for
every $x \in F$.

19. Prove that a closed convex subset C of $\mathbb{R}^n$ is the intersection of the closed
supporting half-spaces that contain C.

20. Find a countable set of closed supporting half-planes whose intersection is
the closed unit disk.

21. Prove that the standard n-simplex in $\mathbb{R}^n$ is compact.
4.8 Function Spaces


We already encountered several examples of function spaces. We mention two
examples here before we embark on a more general study of function spaces. An
early example is the space $\ell^\infty$, which is nothing but the space of bounded functions
from $\mathbb{N}$ to the base field $\mathbb{K}$. A little reflection reveals that the same definition makes
sense when $\mathbb{N}$ is replaced with an arbitrary set X. Another space we studied in some
detail is the space C[0, 1]. We also studied several of the properties of $\ell^\infty$ and C[0, 1],
such as completeness and the lack of local compactness. We start the section with
the definition of a number of important function spaces that generalize $\ell^\infty$ and
C[0, 1] in particular.


Definition. Let X be a nonempty set, and define B(X) to be the set of all
bounded, real or complex, functions on X. Define vector addition and scalar
multiplication in B(X) by $(f + g)(x) = f(x) + g(x)$, $(af)(x) = af(x)$. Here f and
g are bounded functions, $x \in X$, and $a \in \mathbb{K}$. The supremum norm (also the
uniform or $\infty$-norm) of a function $f \in B(X)$ is defined by


$$\|f\|_\infty = \sup_{x \in X} |f(x)|.$$


It is a straightforward exercise to verify that B(X) is a vector space and that the
function $\|\cdot\|_\infty$ is a norm. Observe that it is not assumed that X is necessarily a
metric space. It is sometimes necessary to specify the scalar field. In this case, we
use the notations $B(X, \mathbb{R})$ and $B(X, \mathbb{C})$ to indicate whether we wish to consider
real or complex valued functions.


Definition. Let X be a metric space, and define C(X) to be the set of continuous
real or complex functions on X. The operations on C(X) are defined pointwise
as in the above definition. This clearly makes C(X) into a vector space; see
problem 1 on section 4.3. However, since a continuous function is not necessarily
bounded, the supremum norm is not necessarily defined on C(X).


Definition. Let X be a metric space. The space of continuous bounded functions,
denoted by BC(X), is the intersection of B(X) and C(X). It is a normed subspace
of B(X). In the special case when X is a compact metric space, $C(X) \subseteq B(X)$,
and C(X) = BC(X).


Theorem 4.8.1. The space B(X) of bounded functions on a set X is a complete 
normed linear space. 


Proof. We only prove the completeness of B(X). Let $(f_n)$ be a Cauchy sequence in
B(X), and let $\epsilon > 0$. There exists a natural number N such that, for $m, n > N$,
$\|f_n - f_m\|_\infty = \sup_{x \in X} |f_n(x) - f_m(x)| < \epsilon$. In particular, $(f_n(x))$ is a Cauchy
sequence in $\mathbb{K}$ for each $x \in X$. Therefore $\lim_n f_n(x)$ exists. Define $f(x) = \lim_n f_n(x)$.


We claim that $f \in B(X)$. Since $(f_n)$ is Cauchy, there exists $N \in \mathbb{N}$ such that, for
$n \ge N$, $\|f_n - f_N\|_\infty < 1$. Consequently, for all $x \in X$ and all $n \ge N$,
$$|f_n(x)| \le |f_n(x) - f_N(x)| + |f_N(x)| \le \|f_n - f_N\|_\infty + \|f_N\|_\infty < 1 + \|f_N\|_\infty.$$
Taking the limit of
the quantity on the extreme left of the above string of inequalities, we obtain
$|f(x)| \le 1 + \|f_N\|_\infty$. Thus f is a bounded function.


Finally, we show that $\lim_n f_n = f$ in B(X). Let $\epsilon > 0$. There exists $N \in \mathbb{N}$ such that,
for $n, m > N$ and for all $x \in X$, $|f_n(x) - f_m(x)| < \epsilon$. Taking the limit as $m \to \infty$, we
obtain $|f_n(x) - f(x)| \le \epsilon$ for all $x \in X$ and all $n > N$. This means that $\|f_n - f\|_\infty \le \epsilon$
for all $n > N$, and the proof is now complete.


Theorem 4.8.2. If X is a metric space, then the space BC(X) of continuous bounded 
functions on X is a complete normed linear space. 


Proof. Since BC(X) is a subspace of B(X), it suffices, by theorems 4.8.1 and 4.6.4, to
show that BC(X) is closed in B(X). Let $f \in B(X)$ be a closure point of BC(X). We
need to show that f is continuous. For $\epsilon > 0$, there exists a function $g \in BC(X)$ such
that $\|f - g\|_\infty < \epsilon/3$. Fix $x_0 \in X$, and let $\delta > 0$ be such that $d(x, x_0) < \delta$ implies
that $|g(x) - g(x_0)| < \epsilon/3$. Now if $d(x, x_0) < \delta$, then


$$|f(x) - f(x_0)| \le |f(x) - g(x)| + |g(x) - g(x_0)| + |g(x_0) - f(x_0)| < \epsilon.$$
This proves that f is continuous at $x_0$.


Definition. A function $f : X \to \mathbb{K}$ from a metric space X to the base field $\mathbb{K}$ is
uniformly continuous if, for every $\epsilon > 0$, there exists a number $\delta > 0$ such that,
for all $x, y \in X$ with $d(x, y) < \delta$, $|f(x) - f(y)| < \epsilon$.


What distinguishes uniform continuity from continuity is that $\delta$ in the above
definition does not depend on x.


Theorem 4.8.3. A continuous (real or complex) function f on a compact metric space 
X is uniformly continuous. 


Proof. Let $\epsilon > 0$. For every $x \in X$, there exists $\delta_x > 0$ such that whenever $d(x, \xi) < \delta_x$,
$|f(x) - f(\xi)| < \epsilon/2$. Now $X = \bigcup_{x \in X} B(x, \delta_x)$. Let $3\delta$ be a Lebesgue number for
the open cover $\{B(x, \delta_x) : x \in X\}$. For each $\xi, \eta \in X$ with $d(\xi, \eta) < \delta$, $B(\xi, \delta)$
contains $\eta$ and has diameter $< 3\delta$. By the definition of a Lebesgue number,
there exists $x \in X$ such that $B(\xi, \delta) \subseteq B(x, \delta_x)$. In particular, $d(\xi, x) < \delta_x$, and
$d(\eta, x) < \delta_x$. Consequently, $|f(\xi) - f(\eta)| \le |f(\xi) - f(x)| + |f(x) - f(\eta)| < \epsilon$.


The next result uses function spaces to provide an elegant and succinct proof of 
the existence of the completion of an arbitrary (incomplete) metric space. 


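Theorem 4.8.4. Let X be a metric space. Then there exists a complete metric space
$\hat{X}$ and an isometry $\varphi : X \to \hat{X}$ such that $\varphi(X)$ is dense in $\hat{X}$.


Proof. We know that $B(X, \mathbb{R})$ is a complete metric space. We will find an isometry
$\varphi : X \to B(X, \mathbb{R})$. The theorem follows by taking $\hat{X} = \overline{\varphi(X)}$. To this end, fix an
element $a \in X$. For $\xi \in X$, define a function $\varphi_\xi : X \to \mathbb{R}$ by


$$\varphi_\xi(x) = d(x, \xi) - d(x, a).$$


By the triangle inequality, $|\varphi_\xi(x)| = |d(x, \xi) - d(x, a)| \le d(a, \xi)$ for all $x \in X$.
Therefore $\varphi_\xi$ is bounded. We now show that the map $\xi \mapsto \varphi_\xi$ from X to $B(X, \mathbb{R})$
is an isometry. Specifically, we need to show that, for $\xi, \eta \in X$, $\|\varphi_\xi - \varphi_\eta\|_\infty = d(\xi, \eta)$.
Now $\|\varphi_\xi - \varphi_\eta\|_\infty = \sup_{x \in X} |\varphi_\xi(x) - \varphi_\eta(x)| = \sup_{x \in X} |d(x, \xi) - d(x, \eta)| \le d(\xi, \eta)$.
Therefore $\|\varphi_\xi - \varphi_\eta\|_\infty \le d(\xi, \eta)$.

Since $|\varphi_\xi(\xi) - \varphi_\eta(\xi)| = d(\xi, \eta)$, $\|\varphi_\xi - \varphi_\eta\|_\infty = d(\xi, \eta)$, as desired. ■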
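
On a finite metric space the map $\xi \mapsto \varphi_\xi$ of the proof can be computed explicitly, since each function on X is just a tuple of values and the supremum is a maximum. The Python sketch below is an illustration under assumptions chosen here: X is a set of four points of the plane with the Euclidean distance, and the base point a is the first of them. The names `embed`, `sup_dist`, and `dist` are hypothetical.

```python
import itertools
import math

def embed(points, dist, base):
    """Map each xi to the tuple (phi_xi(x) for x in points), where
    phi_xi(x) = dist(x, xi) - dist(x, base), as in the proof above."""
    return {
        xi: tuple(dist(x, xi) - dist(x, base) for x in points)
        for xi in points
    }

def sup_dist(u, v):
    """Supremum (here: maximum) distance between two tuples of values."""
    return max(abs(s - t) for s, t in zip(u, v))

def dist(p, q):
    """Euclidean distance in the plane (an assumed choice of metric)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

points = [(0, 0), (3, 4), (-1, 2), (5, -2)]
phi = embed(points, dist, base=points[0])

# The sup distance between phi_p and phi_q should equal dist(p, q).
is_isometry = all(
    abs(sup_dist(phi[p], phi[q]) - dist(p, q)) < 1e-12
    for p, q in itertools.combinations(points, 2)
)
print(is_isometry)
```

The tolerance only absorbs floating-point rounding; in exact arithmetic the two distances coincide, with the maximum attained at $x = \xi$.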


Commonly used language to describe the conclusion of the previous theorem is
that X is isometrically embedded in $\hat{X}$. By identifying a point $\xi \in X$ with $\varphi_\xi \in \hat{X}$,
we often think of X as a subset of $\hat{X}$. We employ this convenience in the next
theorem.


Definition. The space $\hat{X}$ that we just constructed is called the completion of X. It is
the (unique) smallest complete metric space that contains X as a dense subspace.
The following theorem frames that concept.


Theorem 4.8.5. Let $(Y, \rho)$ be a complete metric space, and let $\varphi : X \to Y$ be
an isometry from a metric space X into Y such that $\varphi(X)$ is dense in Y. Then $\varphi$
can be uniquely extended to an isometry $\hat{\varphi} : \hat{X} \to Y$.


Proof. Let $x \in \hat{X}$, and choose a sequence $(x_n)$ in X such that $\lim_n x_n = x$. In
particular, $(x_n)$ is Cauchy. Because $\varphi$ is an isometry, $(\varphi(x_n))$ is a Cauchy sequence
in Y. By the completeness of Y, $(\varphi(x_n))$ converges to a point dependent on x. Define
$\hat{\varphi}(x) = \lim_n \varphi(x_n)$. The reader should verify that the function $\hat{\varphi} : \hat{X} \to Y$ is well
defined in the sense that it depends only on x and not on the particular choice of
the sequence $(x_n)$. See problem 4 on section 4.1.


To show that $\hat{\varphi}$ is onto, let $y \in Y$. Since $\varphi(X)$ is dense in Y, there exists a sequence
$(x_n)$ in X such that $\lim_n \varphi(x_n) = y$. Again, because $\varphi$ is an isometry, $(x_n)$ is Cauchy
in X. Since $\hat{X}$ is complete, $(x_n)$ converges to a point $x \in \hat{X}$. By the very definition
of $\hat{\varphi}$, $\hat{\varphi}(x) = y$.


Finally, we verify that $\hat{\varphi}$ is an isometry. Let $x, y \in \hat{X}$ and choose sequences $(x_n)$
and $(y_n)$ in X such that $\lim_n x_n = x$ and $\lim_n y_n = y$. Since $\varphi$ is an isometry,
$\rho(\varphi(x_n), \varphi(y_n)) = d(x_n, y_n)$. Taking the limit of the two sides of the last identity
gives


$$\rho(\hat{\varphi}(x), \hat{\varphi}(y)) = \rho(\lim_n \varphi(x_n), \lim_n \varphi(y_n)) = \lim_n \rho(\varphi(x_n), \varphi(y_n)) = \lim_n d(x_n, y_n) = d(x, y).$$ ■


Example 1. We know that the chordal metric $\chi$ on $\mathbb{R}$ is not a complete metric.
It makes sense to ask if the completion of $(\mathbb{R}, \chi)$ can be described in concrete
terms. The answer is rather obvious now. Since $(\mathbb{R}, \chi)$ is isometric to the circle $S^1$
with one point removed, which is a dense subset of $S^1$, the completion of $(\mathbb{R}, \chi)$ is
(isometric to) the circle $S^1$.
We did use here the fact that the completion of an incomplete metric space is
unique. See theorems 4.8.4 and 4.8.5. More generally, the completion of $(\mathbb{R}^n, \chi)$
is the sphere $S^n$.


We now prove two theorems of great utility: Ascoli’s theorem, which gives necessary
and sufficient conditions for the compactness of a subset of continuous functions
on a compact space in the uniform metric, and the Weierstrass polynomial
approximation theorem. Later in the book, we will encounter several applications
of the two theorems.


Definition. Let X be a metric space. A subset $\mathfrak{F}$ of C(X) is said to be equicontinuous
at $x \in X$ if, for every $\epsilon > 0$, there exists $\delta > 0$ such that, for every $y \in X$ with
$d(x, y) < \delta$, and every $f \in \mathfrak{F}$, $|f(x) - f(y)| < \epsilon$. We say that $\mathfrak{F}$ is equicontinuous
if it is equicontinuous at every $x \in X$.


Definition. A subset $\mathfrak{F}$ of C(X) is said to be uniformly equicontinuous if, for
every $\epsilon > 0$, there exists $\delta > 0$ such that, for every $x, y \in X$ with $d(x, y) < \delta$, and
every $f \in \mathfrak{F}$, $|f(x) - f(y)| < \epsilon$.
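
To get a feeling for these definitions, one can compare two families on [0, 1] numerically. The Python sketch below is an illustration with examples chosen here, not taken from the text: the functions $\sin(nt)/n$ have derivatives bounded by 1, so $\delta = \epsilon$ works for all of them at once, whereas the powers $t^n$ are not equicontinuous at $t = 1$. Only the first N members of each family are examined, so the output is suggestive rather than a proof; `worst_gap` is a hypothetical helper.

```python
import math

def worst_gap(family, x, y):
    """Largest |f(x) - f(y)| over the (finite) family."""
    return max(abs(f(x) - f(y)) for f in family)

N = 2000  # truncation level; the families themselves are infinite
powers = [lambda t, n=n: t ** n for n in range(1, N + 1)]
scaled_sines = [lambda t, n=n: math.sin(n * t) / n for n in range(1, N + 1)]

gaps = {}
for delta in (0.1, 0.01, 0.001):
    gaps[delta] = (
        worst_gap(powers, 1.0, 1.0 - delta),        # stays large as delta shrinks
        worst_gap(scaled_sines, 1.0, 1.0 - delta),  # bounded by delta
    )
    print(delta, gaps[delta])
```

For the powers, the gap at the pair (1, 1 - delta) stays close to 1 however small delta is, once n is allowed to be large enough; for the scaled sines it never exceeds delta.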


Theorem 4.8.6. If X is a compact metric space and $\mathfrak{F}$ is an equicontinuous subset
of C(X), then $\mathfrak{F}$ is uniformly equicontinuous.


Proof. The proof mimics that of theorem 4.8.3 and is left as an exercise. ■
Theorem 4.8.7 (Ascoli’s theorem).⁸ Let X be a compact metric space. A subset $\mathfrak{F}$
of C(X) is compact in the uniform metric if and only if $\mathfrak{F}$ is closed, bounded, and
equicontinuous.


Proof. Suppose $\mathfrak{F}$ is compact. By theorem 4.7.3, $\mathfrak{F}$ is closed and bounded. We show
that $\mathfrak{F}$ is equicontinuous. Let $\epsilon > 0$. The total boundedness of $\mathfrak{F}$ (theorem 4.7.9)
guarantees a finite set of functions $\{f_1, \dots, f_n\}$ in $\mathfrak{F}$ such that $\mathfrak{F} \subseteq \bigcup_{i=1}^n B(f_i, \epsilon/3)$.
By theorem 4.8.3, each $f_i$ is uniformly continuous, so there exists $\delta_i > 0$ such
that if $d(x, y) < \delta_i$, then $|f_i(x) - f_i(y)| < \epsilon/3$. Let $\delta = \min_{1 \le i \le n} \delta_i$. If $f \in \mathfrak{F}$, there
exists $f_i$ such that $\|f_i - f\|_\infty < \epsilon/3$. Now if $x, y \in X$ are such that $d(x, y) < \delta$, then
$|f(x) - f(y)| \le |f(x) - f_i(x)| + |f_i(x) - f_i(y)| + |f_i(y) - f(y)| < \epsilon$. This proves that $\mathfrak{F}$
is equicontinuous. We now prove the converse.


Because of theorems 4.6.4 and 4.7.9, it is sufficient to show that $\mathfrak{F}$ is totally
bounded. Let $\epsilon > 0$. Since $\mathfrak{F}$ is equicontinuous, for every $x \in X$, there exists $\delta_x > 0$
such that, for all $y \in B(x, \delta_x)$ and all $f \in \mathfrak{F}$, $|f(x) - f(y)| < \epsilon/4$. By the compactness
of X, there exists a subset $A = \{x_1, \dots, x_n\}$ of X such that $X = \bigcup_{i=1}^n B(x_i, \delta_{x_i})$. By
the boundedness of $\mathfrak{F}$, the set $R = \bigcup_{i=1}^n \{f(x_i) : f \in \mathfrak{F}\}$ is a bounded subset of the
complex plane, and therefore $\overline{R}$ is compact, and hence R is totally bounded. Thus there
is a finite set $B = \{z_1, \dots, z_m\}$ of complex numbers such that $R \subseteq \bigcup_{j=1}^m B(z_j, \epsilon/4)$.
The following observation is crucial. Consider an arbitrary function f in $\mathfrak{F}$. For
every $x_i \in A$, there is a point $z_j \in B$ such that $|f(x_i) - z_j| < \epsilon/4$. The assignment
$x_i \mapsto z_j$ clearly defines a function from A to B.⁹ This suggests that we look at
the finite set $B^A$ of all functions from A to B. For each $\varphi \in B^A$, we define
a set $\mathfrak{F}_\varphi = \bigcap_{i=1}^n \{f \in \mathfrak{F} : |f(x_i) - \varphi(x_i)| < \epsilon/4\}$.¹⁰ By the above observation,
$\mathfrak{F} = \bigcup \{\mathfrak{F}_\varphi : \varphi \in B^A\}$. We claim that each of the sets $\mathfrak{F}_\varphi$ has a diameter less
than $\epsilon$. This will complete the proof because we can choose a function $f_\varphi$ from
each nonempty $\mathfrak{F}_\varphi$, and then we will have an $\epsilon$-dense subset of $\mathfrak{F}$.


To prove the claim, let $f, g \in \mathfrak{F}_\varphi$. Since $|f(x_i) - \varphi(x_i)| < \epsilon/4$ and $|g(x_i) - \varphi(x_i)| < \epsilon/4$
for every $x_i \in A$, $|f(x_i) - g(x_i)| < \epsilon/2$ for every $x_i \in A$. Now let $x \in X$. Then
$x \in B(x_i, \delta_{x_i})$ for some $x_i \in A$, and


$$|f(x) - g(x)| \le |f(x) - f(x_i)| + |f(x_i) - g(x_i)| + |g(x_i) - g(x)| < \epsilon.$$ ■


Remark. Observe that we did not use the full force of the assumption that $\mathfrak{F}$ is
bounded, just that it is pointwise bounded. Problem 8 at the end of this section
is relevant here.


8 Also widely known as the Arzelà-Ascoli theorem.
9 The point $z_j$ may not be unique, so we pick one such.
10 It is possible that $\mathfrak{F}_\varphi = \emptyset$.
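Theorem 4.8.8 (the Weierstrass polynomial approximation theorem). Let $g \in C[0, 1]$
and let $\epsilon > 0$. Then there exists a polynomial P such that


$$\|g - P\|_\infty < \epsilon.$$


Proof. Observe that the theorem says that the space of polynomials is dense
in C[0, 1]. Without loss of generality, we may replace g with the function
$f(x) = g(x) - [g(0) + x(g(1) - g(0))]$. This is because $g(0) + x(g(1) - g(0))$ is a
polynomial. Replacing g with f has the advantage that $f(0) = f(1) = 0$. Extend f to
$\mathbb{R}$ by defining $f(x) = 0$ when $x \notin [0, 1]$.


Define $L_n(x) = c_n \int_{-1}^{1} f(x + t)(1 - t^2)^n \, dt$, where $c_n^{-1} = \int_{-1}^{1} (1 - t^2)^n \, dt$. Since
$f(x) = 0$ for $x \notin [0, 1]$, we have, for $x \in [0, 1]$,


$$L_n(x) = c_n \int_{-x}^{1-x} f(x + t)(1 - t^2)^n \, dt = c_n \int_0^1 f(\xi)[1 - (\xi - x)^2]^n \, d\xi \qquad (\xi = x + t).$$


The last expression makes it clear that $L_n(x)$ is a polynomial of degree $\le 2n$.


It is a simple induction exercise to show that, for all $n \in \mathbb{N}$, $(1 - t^2)^n \ge 1 - nt^2$, so


$$\int_{-1}^{1} (1 - t^2)^n \, dt \ge \int_{-1/\sqrt{n}}^{1/\sqrt{n}} (1 - nt^2) \, dt = \frac{4}{3\sqrt{n}} > \frac{1}{\sqrt{n}}.$$
In particular, $c_n < \sqrt{n}$.
Now let $\epsilon > 0$. The uniform continuity of f yields a number $0 < \delta < 1$ such that, for
all x, y with $|x - y| < \delta$, $|f(x) - f(y)| < \epsilon$. Now $c_n \int_\delta^1 (1 - t^2)^n \, dt < \sqrt{n}(1 - \delta^2)^n$.
Since $\lim_n \sqrt{n}(1 - \delta^2)^n = 0$, we can pick an integer n such that
$c_n \int_\delta^1 (1 - t^2)^n \, dt < \epsilon$.


Since $\int_{-1}^{1} c_n(1 - t^2)^n \, dt = 1$ and, for $|t| \le 1$, $c_n(1 - t^2)^n \ge 0$,


$$|L_n(x) - f(x)| = \Big| c_n \int_{-1}^{1} [f(x + t) - f(x)](1 - t^2)^n \, dt \Big|$$

$$\le c_n \int_{-1}^{1} |f(x + t) - f(x)|(1 - t^2)^n \, dt$$

$$= c_n \int_{-\delta}^{\delta} |f(x + t) - f(x)|(1 - t^2)^n \, dt + c_n \int_{\delta \le |t| \le 1} |f(x + t) - f(x)|(1 - t^2)^n \, dt$$

$$\le \epsilon \, c_n \int_{-\delta}^{\delta} (1 - t^2)^n \, dt + 2\|f\|_\infty \, c_n \int_{\delta \le |t| \le 1} (1 - t^2)^n \, dt$$

$$\le \epsilon + 4\|f\|_\infty \, c_n \int_\delta^1 (1 - t^2)^n \, dt < \epsilon + 4\epsilon\|f\|_\infty.$$ ■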
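
The polynomials $L_n$ of the proof can be evaluated numerically from the formula $L_n(x) = c_n \int_0^1 f(\xi)[1 - (\xi - x)^2]^n \, d\xi$. The Python sketch below is an illustration under assumptions chosen here: f is the tent function $f(x) = 1/2 - |x - 1/2|$, which satisfies $f(0) = f(1) = 0$, and the integrals are approximated by the midpoint rule. The names `midpoint_integral`, `landau_polynomial`, and `tent` are hypothetical.

```python
def midpoint_integral(h, a, b, m=2000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    step = (b - a) / m
    return step * sum(h(a + (k + 0.5) * step) for k in range(m))

def landau_polynomial(f, n):
    """Return a function evaluating L_n(x) for x in [0, 1], where f is
    assumed to vanish at 0 and 1 (and outside [0, 1])."""
    c_n = 1.0 / midpoint_integral(lambda t: (1.0 - t * t) ** n, -1.0, 1.0)

    def L_n(x):
        return c_n * midpoint_integral(
            lambda xi: f(xi) * (1.0 - (xi - x) ** 2) ** n, 0.0, 1.0
        )

    return L_n

def tent(x):
    """A continuous test function on [0, 1] with tent(0) = tent(1) = 0."""
    return 0.5 - abs(x - 0.5)

grid = [i / 50 for i in range(51)]
errors = []
for n in (10, 40, 160):
    L_n = landau_polynomial(tent, n)
    # Largest deviation |L_n(x) - f(x)| over the sample grid.
    errors.append(max(abs(L_n(x) - tent(x)) for x in grid))
print(errors)
```

The sampled errors decrease as n grows, but slowly, which is consistent with the estimate $c_n < \sqrt{n}$ in the proof: the theorem guarantees convergence, not a fast rate.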
Example 2. C[0, 1] is separable.


Let A be the set of polynomials with rational coefficients. Clearly, A is countable.
We show it is dense in C[0, 1]. Let $f \in C[0, 1]$, and let $\epsilon > 0$. By the Weierstrass
theorem, there is a polynomial $q = \sum_{i=0}^n a_i x^i$ such that $\|f - q\|_\infty < \epsilon/2$. For
each $0 \le i \le n$, choose a rational number $r_i$ such that $|a_i - r_i| < \epsilon/(2(n + 1))$,
and define $p = \sum_{i=0}^n r_i x^i$. Now
$$\|f - p\|_\infty \le \|f - q\|_\infty + \|q - p\|_\infty \le \epsilon/2 + \sum_{i=0}^n |a_i - r_i| < \epsilon.$$ ■
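
The choice of the rational coefficients $r_i$ can be sketched in Python with the standard `fractions` module. This is only an illustration: floating-point numbers stand in for the real coefficients $a_i$ (an assumption of the sketch, since a float is already a rational number), and `rationalize` is a hypothetical helper that searches for a rational with a small denominator within the tolerance $\epsilon/(2(n + 1))$.

```python
from fractions import Fraction
import math

def rationalize(coeffs, eps):
    """Return rationals r_i with |a_i - r_i| < eps / (2 (n + 1)),
    where n + 1 = len(coeffs)."""
    tol = eps / (2 * len(coeffs))
    rationals = []
    for a in coeffs:
        exact = Fraction(a)          # the float a as an exact fraction
        max_den = 1
        r = exact.limit_denominator(max_den)
        while abs(exact - r) >= tol:
            max_den *= 10            # allow larger denominators until close enough
            r = exact.limit_denominator(max_den)
        rationals.append(r)
    return rationals

coeffs = [math.sqrt(2), -math.pi, math.e]   # stand-ins for a_0, a_1, a_2
eps = 1e-3
rationals = rationalize(coeffs, eps)

# On [0, 1] each |x^i| <= 1, so the sup distance between the two
# polynomials is at most the sum of the coefficient errors.
total = sum(abs(Fraction(a) - r) for a, r in zip(coeffs, rationals))
print(rationals, float(total))
```

The printed total is below $\epsilon/2$, which is the bound on $\|q - p\|_\infty$ used in the example.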


The last example and the Weierstrass polynomial approximation theorem have 
far-reaching generalizations. Their proofs require the full power of the Stone- 
Weierstrass theorem. The proof of all three theorems can be found in section 4.9. 


Example 3. Let $f \in C[0, 1]$ be such that $\int_0^1 x^n f(x) \, dx = 0$ for every nonnegative
integer n. Then $f = 0$.


Without loss of generality, assume that f is a real function. The assumption
implies that $\int_0^1 f(x)p(x) \, dx = 0$ for any polynomial p. By the Weierstrass theorem,
for every $\epsilon > 0$ there is a polynomial p such that $\|f - p\|_\infty < \epsilon$.

Now $\big| \int_0^1 f^2 \, dx \big| = \big| \int_0^1 f(f - p) \, dx \big| \le \int_0^1 |f(x)||f(x) - p(x)| \, dx \le \epsilon \int_0^1 |f| \, dx \le \epsilon \|f\|_\infty$.
Since $\epsilon$ is arbitrary, $\int_0^1 f^2 \, dx = 0$. The continuity of f forces $f = 0$. ■


The discussion so far has been focused on scalar-valued functions, and all the 
function spaces we have studied are normed linear spaces. We now expand the 
discussion and consider functions that take values in a general metric space. The 
next two examples are extensions of theorems 4.8.1 and 4.8.2. 


Example 4. Let $(Y, \rho)$ be a bounded metric space, and let X be an arbitrary
nonempty set. For functions $f, g : X \to Y$, define


$$D(f, g) = \sup_{x \in X} \rho(f(x), g(x)).$$


Then D is a metric on the set $Y^X$ of all functions from X to Y. If, in addition, $\rho$
is a complete metric, so is D.


Observe that the definition of D makes sense because $\rho$ is a bounded metric.
The verification that D is a metric on $Y^X$ is straightforward. Now assume that $\rho$
is complete, and let $(f_n)$ be a Cauchy sequence in $Y^X$. For $\epsilon > 0$, there is a natural
number N such that, for $m, n > N$, $\sup_{x \in X} \rho(f_n(x), f_m(x)) < \epsilon$. In particular, for
an arbitrary $x \in X$, the sequence $(f_n(x))$ is a Cauchy sequence in Y. By the
completeness of $\rho$, $f(x) = \lim_n f_n(x)$ exists. We now show that $f_n$ converges to f
in the metric D.
Let $\epsilon$, N, n, and m be as in the last paragraph. Taking the limit as $m \to \infty$, we
obtain $\rho(f_n(x), f(x)) \le \epsilon$. Since the last inequality holds for all $x \in X$, $D(f_n, f) \le \epsilon$
for all $n > N$. This shows that $\lim_n D(f_n, f) = 0$. ■


Example 5. Let $(Y, \rho)$ be as in the previous example, and assume that $\rho$ is
complete. If (X, d) is a metric space, then the space (C(X, Y), D) of continuous
functions from (X, d) to $(Y, \rho)$ is complete.


We show that C(X, Y) is closed in $(Y^X, D)$. Let $f \in Y^X$ be a closure point
of C(X, Y). We need to show that f is continuous. For $\epsilon > 0$, there exists a
function $g \in C(X, Y)$ such that $D(f, g) < \epsilon/3$. Fix $x_0 \in X$, and let $\delta > 0$ be such
that $d(x, x_0) < \delta$ implies that $\rho(g(x), g(x_0)) < \epsilon/3$. Now if $d(x, x_0) < \delta$, then
$\rho(f(x), f(x_0)) \le \rho(f(x), g(x)) + \rho(g(x), g(x_0)) + \rho(g(x_0), f(x_0)) < \epsilon$. This proves
that f is continuous at $x_0$. ■


Let $I = [0, 1]$ be the closed unit interval, and let $I^2 = [0, 1] \times [0, 1]$ be the closed
unit square; I is given the usual metric on $\mathbb{R}$, and we give $I^2$ the product metric
$\rho((r, s), (u, v)) = \max\{|r - u|, |s - v|\}$. In theorem 4.8.9, we will make use of
the space $C(I, I^2)$ defined in example 5 with the complete metric D defined in
example 4.


Application: A Space-Filling Curve 


Let $J = [a, b]$ be an arbitrary closed interval, and let S be an arbitrary closed square.
We will refer to a function in C(J, S) as a path. We are particularly interested in the
four types of triangular paths g shown in figure 4.4. The triangles differ only in
orientation. Specifically, the intervals $[a, (a + b)/2]$ and $[(a + b)/2, b]$ are mapped
linearly onto the straight line segments of the triangle such that g(a) and g(b) are
adjacent corners of S and $g((a + b)/2)$ is the center of S. See the formula defining
the path $f_0$ in the proof of theorem 4.8.9.

Before we embark on the task of finding the space-filling curve, we describe
a special type of operation we need in the proof of the next theorem. Observe
that the paths g intersect only two of the four sub-squares that result from
bisecting the sides of S. We define the modified paths g′ as follows. Divide [a, b]
into four congruent subintervals $J_j = [a + j(b - a)/4, a + (j + 1)(b - a)/4]$, $0 \le j \le 3$,
and map the subinterval $J_j$ linearly onto the four triangular paths that make up
the path g′, as shown in figure 4.5. Observe that the paths g′ intersect all the
sub-squares of S.

We are now ready to find the space-filling curve. The statement of the theorem
below justifies the term space filling.
Figure 4.4 The triangular paths g


Theorem 4.8.9. There exists a continuous surjection f from I to $I^2$.


Proof. We apply the operation discussed above the theorem to construct a sequence
$(f_n)$ that converges to the desired function f.
Define the path $f_0 : I \to I^2$ by


$$f_0(t) = \begin{cases} (t, t) & \text{if } 0 \le t \le 1/2, \\ (t, 1 - t) & \text{if } 1/2 \le t \le 1. \end{cases}$$
Figure 4.4(a) depicts the path $f_0$.

Applying the operation described above the theorem, we can find a path $f_1$
consisting of the four triangular paths shown in figure 4.5(a). Next we apply the
operation to each of the triangular pieces of $f_1$ to produce the path $f_2$. Observe
that this requires dividing each of the subintervals $I_j = [j/4, (j + 1)/4]$ into four
congruent sub-intervals and modifying the restriction of $f_1$ to each $I_j$, depending on
its orientation, according to figure 4.5. The repeated application of the operation
produces a sequence of paths $(f_n)$, which we show converges to the space-filling
curve. The path $f_n$ consists of $4^n$ triangular paths, and each triangle is contained in
a square of length $2^{-n}$. The triangular pieces correspond to partitioning I into the
subintervals $[j/4^n, (j + 1)/4^n]$, $0 \le j \le 4^n - 1$. The paths $f_0$, $f_1$, and $f_2$ are shown in
figure 4.6.


Figure 4.5 The modified paths g′


Figure 4.6 The paths $f_0$, $f_1$, and $f_2$


A crucial feature of the sequence $(f_n)$ is that if a triangular piece T of the path
$f_n$ is contained in a square S of length $2^{-n}$, then the four triangular pieces of
$f_{n+1}$ obtained by modifying T are contained in the same square S. Thus, for every
$t \in I$, $\rho(f_{n+1}(t), f_n(t)) \le 2^{-n}$. Consequently, $D(f_{n+1}, f_n) \le 2^{-n}$. This is the crux of
the proof.
Now, for positive integers m, n, if $m > n$, then


$$D(f_m, f_n) \le D(f_m, f_{m-1}) + D(f_{m-1}, f_{m-2}) + \dots + D(f_{n+1}, f_n)$$

$$\le 2^{-(m-1)} + \dots + 2^{-n} < 2^{-n+1} \to 0 \text{ as } n \to \infty.$$


Thus the sequence $(f_n)$ is Cauchy, and the completeness of $(C(I, I^2), D)$ guarantees
that $f_n$ converges to a function $f \in C(I, I^2)$.


Since I is compact, the range of f is compact and hence closed in $I^2$. The proof will
be complete if we show that the range of f is dense in $I^2$. Let $x \in I^2$, and let $\epsilon > 0$.
Choose an integer n such that $2^{-n} < \epsilon/2$ and $D(f_n, f) < \epsilon/2$. The point x belongs
to one of the $4^n$ squares that contain the triangular pieces of $f_n$. Let S be such a
square. If $t \in [0, 1]$ is such that $f_n(t)$ is on the triangular piece contained in S, then
$\rho(f_n(t), x) \le 2^{-n}$ and


$$\rho(f(t), x) \le \rho(f(t), f_n(t)) + \rho(f_n(t), x) \le D(f_n, f) + \epsilon/2 < \epsilon.$$ ■


Exercises


1. Let Y be a complete metric space, let A be a dense subset of a metric space
X, and suppose that $f : A \to Y$ is a uniformly continuous function.
(a) Show that f maps Cauchy sequences into Cauchy sequences.
(b) Show that f admits a unique uniformly continuous extension from X to Y.

2. Give an example to show that the mere continuity of f in the above exercise
is not enough to guarantee an extension.

3. Prove that the completion of a separable metric space is separable.

4. Prove that the function $\hat{\varphi}$ in the proof of theorem 4.8.5 is well defined.

5. Let $\mathfrak{F}$ be a pointwise bounded family of continuous, scalar-valued functions
on a complete metric space X. Thus, for each $x \in X$, $\sup\{|f(x)| : f \in \mathfrak{F}\} < \infty$.
Prove that there exists a nonempty open subset $V \subseteq X$ such that
$\sup\{|f(x)| : x \in V, f \in \mathfrak{F}\} < \infty$. Hint: Let $A_n = \{x \in X : |f(x)| \le n \text{ for every } f \in \mathfrak{F}\}$.

6. Show that if X is compact and $\mathfrak{F} \subseteq C(X)$ is equicontinuous, then $\overline{\mathfrak{F}}$ is
equicontinuous.

7. Ascoli’s theorem is often applied in the following form. Let X be a compact
metric space. If a sequence $(f_n)$ of functions in C(X) is bounded and
equicontinuous, then $(f_n)$ contains a subsequence that converges in C(X).
Prove this version of Ascoli’s theorem.

8. Let X be a compact metric space, and let $\mathfrak{F}$ be an equicontinuous family
of functions in C(X). Prove that if $\mathfrak{F}$ is pointwise bounded, then $\mathfrak{F}$ is
uniformly bounded. Hint: Let $g(x) = \sup\{|f(x)| : f \in \mathfrak{F}\}$. For $n \in \mathbb{N}$, let
$U_n = \{x \in X : g(x) < n\}$. Prove that $U_n$ is open and that $\{U_n : n \in \mathbb{N}\}$ is an
open cover of X.

9. Let X be a compact metric space, and let $(f_n)$ be a sequence of equicontinuous
functions that converges pointwise to a function f. Prove that f is
continuous and that $(f_n)$ converges uniformly to f.

10. Dini’s theorem. Let X be a compact metric space, and let $f_n : X \to \mathbb{R}$ be a
sequence of continuous functions such that, for $x \in X$, $f_1(x) \ge f_2(x) \ge \dots$,
and $\lim_n f_n(x) = 0$. Prove that $f_n$ converges uniformly to the zero function.

11. Let X be a compact metric space, and let $f_n : X \to \mathbb{R}$ be a sequence of
continuous functions such that, for $x \in X$, $f_1(x) \le f_2(x) \le \dots$, and
$\lim_n f_n(x) = f(x)$, where f is continuous. Prove that $f_n$ converges uniformly to f.

12. Prove that C[a, b] is not locally compact.

13. Prove that C[a, b] is separable and that polynomials are dense in C[a, b].

14. Let $X = C(\mathbb{R}^n)$, and let $K_i$ be the closed ball of radius i and centered at
the origin. Clearly, $K_1 \subseteq K_2 \subseteq \dots$, and $\bigcup_{i=1}^\infty K_i = \mathbb{R}^n$. Let $\|\cdot\|_i$ denote the
uniform norm on $C(K_i)$. For a continuous function $f : \mathbb{R}^n \to \mathbb{C}$,
$\|f\|_i = \sup\{|f(x)| : x \in K_i\}$ denotes the norm of the restriction of f to $K_i$. Define
a metric d on X as follows:
$$d(f, g) = \sum_{i=1}^\infty 2^{-i} \min\{1, \|f - g\|_i\}.$$
(a) Show that d is a metric on X.
(b) Show that, for each $i \in \mathbb{N}$, $d(f, g) \le \|f - g\|_i + 2^{-i}$.
(c) Show that a sequence of functions $(f_m)$ in X converges in the metric d
to $f \in X$ if and only if $(f_m)$ converges uniformly to f on compact subsets
of $\mathbb{R}^n$.
(d) Show that d is a complete metric.

15. Prove that, for every natural number n, there is a continuous surjection
from I to $I^n$.

16. Prove that a countable metric space can be isometrically injected in $\ell^\infty$.
Hint: Examine the proof of theorem 4.8.4.

17. Prove that every separable metric space can be isometrically injected in $\ell^\infty$.


4.9 The Stone-Weierstrass Theorem 


Like the Weierstrass theorem, the Stone-Weierstrass theorem is an approximation
theorem. However, the Stone-Weierstrass theorem allows us to prove far-reaching
generalizations of the results we obtained in section 4.8. Powerful theorems often
require the development of elaborate machinery, and this section demonstrates
that.

Throughout this section, X is a compact metric space, C(X,R) is the space of
continuous, real-valued functions on X, and C(X,C) is the space of continuous,
complex-valued functions on X. We use the notation C(X) for either of the two
spaces when the distinction is immaterial; C(X) is endowed with the uniform
norm and, as such, is a Banach space. Additionally, C(X) is an algebra with the
pointwise multiplication of functions: (fg)(x) = f(x)g(x). See the definition of an
algebra in section 3.4.


Let A be a subalgebra of C(X) satisfying the following standing assumptions: 


(SA1) A contains all constant functions. 
(SA2) A separates points in X in the sense that if x, y are distinct points in X,
then there exists a function h ∈ A such that h(x) ≠ h(y).


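Lemma 4.9.1. Let $A$ be a subalgebra of $C(X,\mathbb{R})$ satisfying SA1 and SA2. Then, for
$f, g \in A$, the functions $\max\{f, g\}$ and $\min\{f, g\}$ are in $\overline{A}$ (the closure of $A$).


Proof. Since

$$\max\{f, g\} = \tfrac{1}{2}(f + g) + \tfrac{1}{2}|f - g| \quad\text{and}\quad \min\{f, g\} = \tfrac{1}{2}(f + g) - \tfrac{1}{2}|f - g|,$$

and since $\overline{A}$ is a subspace of $C(X,\mathbb{R})$, it is sufficient to prove that $|f| \in \overline{A}$ whenever
$f \in A$. Let $M = \|f\|_\infty$, and let $\varepsilon > 0$. By the Weierstrass approximation theorem
applied to the function $g(t) = |t|$, there exists a polynomial $p(t) = \sum_{i=0}^{n} a_i t^i$ ($a_i \in \mathbb{R}$)
such that, for all $t \in [-M, M]$, $\big| |t| - p(t) \big| < \varepsilon$. Consider the function
$p \circ f = \sum_{i=0}^{n} a_i f^i$. Since $A$ is an algebra, $p \circ f \in A$, and since
$\big| |f(x)| - p(f(x)) \big| < \varepsilon$ for all $x \in X$, $\big\| |f| - p \circ f \big\|_\infty \le \varepsilon$ and $|f| \in \overline{A}$. ∎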
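The mechanism of the proof can be tried numerically. The sketch below does not use the Weierstrass theorem itself; as an assumption made for illustration, it swaps in the classical iteration $p_0 = 0$, $p_{j+1}(t) = p_j(t) + \tfrac{1}{2}(t^2 - p_j(t)^2)$, whose terms are polynomials that increase to $|t|$ on $[-1, 1]$ with error below $2/(k+1)$ after $k$ steps. The helper names `abs_poly` and `approx_max` are hypothetical.

```python
def abs_poly(t, k):
    """Value at t of the polynomial p_k, where p_0 = 0 and
    p_{j+1}(t) = p_j(t) + (t**2 - p_j(t)**2) / 2.
    On [-1, 1] these polynomials increase to |t|, with error below 2/(k+1)."""
    p = 0.0
    for _ in range(k):
        p += (t * t - p * p) / 2.0
    return p


def approx_max(f, g, bound, k):
    """Approximate max{f, g} by (f + g)/2 + (bound/2) * p_k((f - g)/bound).
    `bound` must satisfy |f(x) - g(x)| <= bound on the domain of interest."""
    def h(x):
        d = (f(x) - g(x)) / bound
        return 0.5 * (f(x) + g(x)) + 0.5 * bound * abs_poly(d, k)
    return h


if __name__ == "__main__":
    f = lambda x: x
    g = lambda x: 1.0 - x
    h = approx_max(f, g, bound=1.0, k=200)
    grid = [i / 1000 for i in range(1001)]
    # Largest deviation from max{f, g} on the grid.
    print(max(abs(h(x) - max(f(x), g(x))) for x in grid))
```

Because `h` is a polynomial expression in `f` and `g`, it belongs to any algebra containing them, which is exactly why the maximum lies in the closure.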


Lemma 4.9.2. Let $A$ be a subalgebra of $C(X,\mathbb{R})$ satisfying SA1 and SA2, and let
$f \in C(X,\mathbb{R})$. For every $y, z \in X$, there exists a function $g_{y,z} \in A$ such that
$g_{y,z}(y) = f(y)$ and $g_{y,z}(z) = f(z)$.


Proof. If $y = z$, define $g_{y,z}(x) = f(y)$ (a constant function). Otherwise, by SA2, there
exists a function $h \in A$ such that $h(y) \ne h(z)$. The following function is in $A$ and
satisfies the requirements:

$$g_{y,z}(x) = f(y) + \big(f(z) - f(y)\big)\,\frac{h(x) - h(y)}{h(z) - h(y)}.$$


Theorem 4.9.3 (the Stone-Weierstrass theorem). Let A be a subalgebra of 
C(X, R) satisfying SA1 and SA2. Then A is dense in C(X,R). 


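Proof. It is sufficient to show that $\overline{A}$ is dense in $C(X,\mathbb{R})$. Observe that $\overline{A}$ is a
subalgebra of $C(X,\mathbb{R})$ that satisfies SA1 and SA2. Let $f \in C(X,\mathbb{R})$, and let $\varepsilon > 0$.
We will show that there is a function $g \in \overline{A}$ such that $\|f - g\|_\infty < \varepsilon$. For $y, z \in X$,
let $g_{y,z}$ be as in lemma 4.9.2, and let $U_{y,z} = (f - g_{y,z})^{-1}(-\varepsilon, \varepsilon)$. By the continuity
of $f - g_{y,z}$, $U_{y,z}$ is open and, clearly, $y, z \in U_{y,z}$. In particular, for every $x \in U_{y,z}$,
$f(x) < g_{y,z}(x) + \varepsilon$ and $f(x) > g_{y,z}(x) - \varepsilon$. For a fixed $z \in X$, the collection
$\{U_{y,z} : y \in X\}$ covers $X$. Thus there exists a finite subset $\{y_1, \dots, y_{n_z}\}$ of $X$ such
that $\bigcup_{i=1}^{n_z} U_{y_i,z} = X$.

Define $g_z = \max\{g_{y_i,z} : 1 \le i \le n_z\}$, and let $V_z = \bigcap_{i=1}^{n_z} U_{y_i,z}$. The function $g_z$ is
in $\overline{A}$ by lemma 4.9.1. Observe that

$$f(x) < g_z(x) + \varepsilon \quad\text{for all } x, z \in X,$$

and

$$f(x) > g_z(x) - \varepsilon \quad\text{for all } z \in X \text{ and all } x \in V_z.$$

Now each $V_z$ is an open neighborhood of $z$; hence the collection $\{V_z : z \in X\}$
covers $X$. Thus there exists a finite subset $\{z_1, \dots, z_n\}$ of $X$ such that $\bigcup_{j=1}^{n} V_{z_j} = X$.
Finally, let

$$g = \min\{g_{z_j} : 1 \le j \le n\}.$$

The function $g$ is in $\overline{A}$ by lemma 4.9.1, applied to the subalgebra $\overline{A}$. A little
reflection reveals that $g(x) - \varepsilon < f(x) < g(x) + \varepsilon$ for all $x \in X$.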


The following corollary is a far-reaching generalization of the Weierstrass polyno- 
mial approximation theorem. 


Corollary 4.9.4. If $X$ is a compact subset of $\mathbb{R}^n$, then the set $A$ of polynomials with
real coefficients in $n$ variables is dense in $C(X,\mathbb{R})$.


Proof. Clearly, $A$ is an algebra, and it contains all constant functions. To show that $A$
separates points in $X$, let $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ be distinct points in
$X$. The polynomial $p(t_1, \dots, t_n) = \sum_{i=1}^{n} (t_i - x_i)^2$ satisfies $p(x) = 0$ and $p(y) > 0$.
By the Stone-Weierstrass theorem, $A$ is dense in $C(X,\mathbb{R})$.


The following result is the promised generalization of example 2 in section 4.8. 
Corollary 4.9.5. If X is a compact metric space, then C(X,C) is separable. 


Proof. Since compact metric spaces are separable, let $\{\xi_n : n \in \mathbb{N}\}$ be a countable
dense subset of $X$. For $n \in \mathbb{N}$, define $f_n(x) = d(x, \xi_n)$, and define $f_0(x) = 1$. Let
$\mathcal{M}$ be the set of all finite products of the functions $f_0, f_1, \dots$, and let $A$ be the set
of all linear combinations with real coefficients of elements in $\mathcal{M}$. Clearly, $A$ is
a subalgebra of $C(X,\mathbb{R})$.* We show that $A$ separates points in $X$. If $x$ and $y$ are
distinct, let $\delta = d(x,y)/4$. There exists a natural number $n$ such that $d(x, \xi_n) < \delta$.
The function $f_n$ separates $x$ and $y$ since $f_n(x) < \delta$ and $f_n(y) > 3\delta$.

* In fact, $A$ is the subalgebra generated by the set $\{f_n\}_{n=0}^{\infty}$, that is, the smallest
subalgebra of $C(X,\mathbb{R})$ that contains $\{f_n\}_{n=0}^{\infty}$.


By theorem 4.9.3, $A$ is dense in $C(X,\mathbb{R})$. We now show that the countable set
$A_0 = \{\sum_{i=1}^{n} q_i g_i : n \in \mathbb{N}, q_i \in \mathbb{Q}, g_i \in \mathcal{M}\}$ is dense in $C(X,\mathbb{R})$. By the first part of
the proof, it is enough to show that if $f = \sum_{i=1}^{n} a_i g_i \in A$ and $\varepsilon > 0$, then there exists
an element $h \in A_0$ such that $\|f - h\|_\infty < \varepsilon$. Let $M = \max\{\|g_i\|_\infty : 1 \le i \le n\}$
and choose rational numbers $q_i$ such that $|a_i - q_i| < \varepsilon/(nM)$. Set $h = \sum_{i=1}^{n} q_i g_i$.
Clearly, $\|f - h\|_\infty < \varepsilon$.


To show that $C(X,\mathbb{C})$ is separable, let $f = f_1 + i f_2 \in C(X,\mathbb{C})$, and choose functions
$h_1$ and $h_2$ in $A_0$ such that $\|f_1 - h_1\|_\infty < \varepsilon/2$ and $\|f_2 - h_2\|_\infty < \varepsilon/2$. The function
$h = h_1 + i h_2$ is in $A_0 + iA_0$ and satisfies $\|f - h\|_\infty < \varepsilon$. Since $A_0 + iA_0$ is
countable, the proof is complete.


Theorem 4.9.3 does not extend to C(X, C), as we show in example 1 below. First 
we need a definition. 


Definition. Let $C(S^1,\mathbb{C})$* be the space of all continuous complex functions $f$ on
$[-\pi, \pi]$ such that $f(-\pi) = f(\pi)$. It is clear that $C(S^1,\mathbb{C})$ is a closed subspace of
$C[-\pi, \pi]$ when both spaces are given the uniform norm.

* The reason for the notation will be justified in the next paragraph.


Another way to view the space $C(S^1,\mathbb{C})$ is as follows. The restriction of any
continuous, $2\pi$-periodic function $g : \mathbb{R} \to \mathbb{C}$ to the interval $[-\pi, \pi]$ is in the space
$C(S^1,\mathbb{C})$. Conversely, any function $f \in C(S^1,\mathbb{C})$ can be extended by periodicity
to a continuous, $2\pi$-periodic function. Thus the space $C(S^1,\mathbb{C})$ is also the space
of continuous, $2\pi$-periodic functions. Every point $\theta \in [-\pi, \pi)$ corresponds to a
unique point $e^{i\theta}$ on the unit circle $S^1$ in the complex plane, and, for every function
$f \in C(S^1,\mathbb{C})$, there corresponds a function $\tilde{f} : S^1 \to \mathbb{C}$, where $\tilde{f}(e^{i\theta}) = f(\theta)$ (here
$\theta \in [-\pi, \pi)$). The correspondence $f \leftrightarrow \tilde{f}$ is unambiguous because of the condition
$f(-\pi) = f(\pi)$. Therefore the space of $2\pi$-periodic functions can also be thought of
as the space of continuous functions on the unit circle $S^1$. We adopt any of the
three equivalent characterizations of $C(S^1,\mathbb{C})$, as convenience dictates.


Example 1. For $n = 0, 1, \dots$ let $u_n(t) = e^{int}$. The set $\{u_n\}_{n=0}^{\infty}$ separates points in
$[-\pi, \pi]$. Thus the set $A = \{\sum_{j=0}^{n} a_j u_j : a_j \in \mathbb{C}\}$ is a subalgebra of $C(S^1,\mathbb{C})$ that
separates points in $[-\pi, \pi]$ and contains all constant functions. However,
$A$ is not dense in $C(S^1,\mathbb{C})$. We show that for the function $f(t) = e^{-it}$,
$\|f - p\|_\infty \ge 1$ for all $p \in A$. First observe that for any $p = \sum_{j=0}^{n} a_j u_j \in A$,

$$\int_{-\pi}^{\pi} \overline{f}\,p\,dt = \sum_{j=0}^{n} a_j \int_{-\pi}^{\pi} e^{i(j+1)t}\,dt = 0.$$

Because $|f|^2 = f\overline{f} = 1$,

$$2\pi = \int_{-\pi}^{\pi} f\overline{f}\,dt = \int_{-\pi}^{\pi} \overline{f}\,(f - p)\,dt \le \int_{-\pi}^{\pi} |f - p|\,dt \le 2\pi \|f - p\|_\infty.$$

Thus $\|f - p\|_\infty \ge 1$, as claimed.
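The two facts used in example 1 can be checked on a grid. The following is a minimal sketch, assuming an equally spaced grid on $[-\pi, \pi)$ as a stand-in for the integral and the supremum; the helper names are hypothetical. For coefficient lists shorter than the grid size, the grid average of $\overline{f}p$ vanishes up to rounding, so the grid maximum of $|f - p|$ cannot drop below 1.

```python
import cmath
import math
import random


def grid_points(samples):
    """Equally spaced points in [-pi, pi)."""
    return [-math.pi + 2.0 * math.pi * k / samples for k in range(samples)]


def poly_in_A(coeffs, t):
    """p(t) = sum_j coeffs[j] * e^{ijt}, an element of the algebra A of example 1."""
    return sum(a * cmath.exp(1j * j * t) for j, a in enumerate(coeffs))


def mean_conj_f_times_p(coeffs, samples=256):
    """Grid average of conj(f) * p for f(t) = e^{-it}; approximates (1/2pi) * integral."""
    pts = grid_points(samples)
    return sum(cmath.exp(1j * t) * poly_in_A(coeffs, t) for t in pts) / samples


def sup_distance(coeffs, samples=256):
    """Grid estimate of the uniform distance between f(t) = e^{-it} and p."""
    pts = grid_points(samples)
    return max(abs(cmath.exp(-1j * t) - poly_in_A(coeffs, t)) for t in pts)


if __name__ == "__main__":
    random.seed(0)
    coeffs = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(6)]
    print(abs(mean_conj_f_times_p(coeffs)), sup_distance(coeffs))
```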


The following is the generalization of theorem 4.9.3 to the complex case. 


Theorem 4.9.6 (the Stone-Weierstrass theorem). Let X be a compact metric space 
and let A be a subalgebra of C(X,C) that satisfies SA1 and SA2. If A is closed 
under complex conjugation, then A is dense in C(X,C). 


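Proof. Let $\mathcal{R} = \{\mathrm{Re}(f) : f \in A\}$, and let $\mathcal{J} = \{\mathrm{Im}(f) : f \in A\}$. If $f = f_1 + i f_2 \in A$, then
$if = -f_2 + i f_1 \in A$. Thus $f_2 \in \mathcal{R}$ and $f_1 \in \mathcal{J}$. It follows that $\mathcal{J} = \mathcal{R}$. First we show
that $\mathcal{R}$ satisfies SA1 and SA2. It is clear that $\mathcal{R}$ contains all constant functions. If
$x$ and $y$ are distinct points of $X$, then there exists $f \in A$ such that $f(x) \ne f(y)$. Thus
$f_1(x) \ne f_1(y)$ or $f_2(x) \ne f_2(y)$. Because $f_1$ and $f_2$ are in $\mathcal{R}$, $\mathcal{R}$ separates points in $X$.
Theorem 4.9.3 implies that $\mathcal{R}$ is dense in $C(X,\mathbb{R})$. Because $A$ is closed under complex
conjugation, $f_1 = (f + \overline{f})/2 \in A$; thus $\mathcal{R} \subseteq A$, and hence $\mathcal{R} + i\mathcal{R} \subseteq A$. We show
that $\mathcal{R} + i\mathcal{R}$ is dense in $C(X,\mathbb{C})$. Let $f = f_1 + i f_2 \in C(X,\mathbb{C})$, and let $\varepsilon > 0$. By the
density of $\mathcal{R}$ in $C(X,\mathbb{R})$, there are functions $h_1, h_2 \in \mathcal{R}$ such that
$\|f_1 - h_1\|_\infty < \varepsilon/2$ and $\|f_2 - h_2\|_\infty < \varepsilon/2$. The function $h = h_1 + i h_2$ is in
$\mathcal{R} + i\mathcal{R}$ and $\|f - h\|_\infty < \varepsilon$.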


Example 2. For $n \in \mathbb{Z}$, let $u_n(t) = e^{int}$, and consider the set
$\mathcal{T} = \mathrm{Span}(\{u_n : n \in \mathbb{Z}\})$. $\mathcal{T}$ is clearly a subalgebra of $C(S^1,\mathbb{C})$ that satisfies
the assumptions of theorem 4.9.6. Therefore $\mathcal{T}$ is dense in $C(S^1,\mathbb{C})$.


The last example is really a well-known theorem. We will expand this discussion 
in a more focused manner in the next section. 


4.10 Fourier Series and Orthogonal Polynomials 


In section 3.7 we studied the geometry of inner product spaces more than their
metric properties. We now have a bigger toolbox with which we can tackle
inner product spaces. Before we pose the central questions of this section, let us
summarize the highlights of section 3.7, upon which this section rests heavily. Let
$\{u_1, u_2, \dots\}$ be an infinite orthonormal sequence of vectors in an inner product
space $H$. The orthogonal projection of an element $x \in H$ on the finite-dimensional
space $M_n = \mathrm{Span}(\{u_1, \dots, u_n\})$ is, by definition, the vector $S_n x = \sum_{i=1}^{n} \langle x, u_i \rangle u_i$. We
know from theorem 3.7.6 that the vector $S_n x$ is the closest vector in $M_n$ to $x$,
and we also say that $S_n x$ is the best approximation of $x$ in $M_n$. Now that we have
studied convergence in metric spaces, it is natural to ask whether $\lim_n S_n x = x$.
Unfortunately, we are still not in a position to state an exact set of conditions under
which a general answer can be provided because the answer depends on the space
$H$ and the sequence $\{u_1, u_2, \dots\}$. The reader should suspect that completeness is
relevant here, and it is. The spaces we study in this section are not complete, and this
is precisely the reason we cannot decisively settle the question posed above about
the convergence of the sequence $S_n x$. In two of the major examples we consider in
this section, we will answer this question satisfactorily but not completely. The full
picture will materialize in sections 7.2 and 8.9.


Fourier series 


In section 3.7, we defined the inner product
$\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\overline{g(x)}\,dx$ on the space $C[-\pi, \pi]$. The sequence

$$\{u_n(t) = e^{int} : n \in \mathbb{Z}\}$$

is an orthonormal sequence with respect to the above inner product. The norm
of a function $f$ induced by the inner product will be denoted by $\|f\|_2$ in order
to distinguish it from the uniform norm on $C[-\pi, \pi]$, which will also play a
prominent role in this section. Thus the uniform norm of a function $f \in C[-\pi, \pi]$
will be denoted by the usual notation $\|f\|_\infty$, while

$$\|f\|_2 = \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx \right)^{1/2}.$$

It is clear that $\|f\|_2 \le \|f\|_\infty$.


In section 4.9, we introduced the space $C(S^1,\mathbb{C})$ (which we now abbreviate
$C(S^1)$) of $2\pi$-periodic functions on $[-\pi, \pi]$. It is clear that $C(S^1)$ is a closed
subspace of $(C[-\pi, \pi], \|\cdot\|_\infty)$.


For a function $f \in C[-\pi, \pi]$, we define the Fourier series of $f$ to be the formal
series

$$\sum_{n=-\infty}^{\infty} \hat{f}(n) e^{int}, \quad\text{where}\quad \hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-int}\,dt.$$

The numbers $\hat{f}(n)$, $n \in \mathbb{Z}$, are called the Fourier coefficients of $f$. It is clear that
the partial sum of the Fourier series,

$$S_n f(t) = \sum_{j=-n}^{n} \hat{f}(j) e^{ijt},$$

is the orthogonal projection of $f$ on $M_n = \mathrm{Span}(\{u_j : -n \le j \le n\})$.

The question now is whether the sequence $S_n f$ converges to $f$ in the 2-norm. We
answer the question affirmatively after we establish a few facts. Convergence in the
2-norm is sometimes called the mean square convergence in order to distinguish
it from the uniform convergence of $S_n f$ to $f$, which is also a valid question.


Definition. A trigonometric polynomial is a linear combination of the functions
$\{u_n = e^{int} : n \in \mathbb{Z}\}$. Thus a trigonometric polynomial is a function of the form

$$p(t) = \sum_{j=-n}^{n} c_j e^{ijt}, \quad\text{where } c_j \in \mathbb{C},\ n \in \mathbb{N}.$$

The collection $\mathcal{T}$ of all trigonometric polynomials is clearly the span of the
sequence $\{u_n : n \in \mathbb{Z}\}$ and is a subspace of $C(S^1)$.


Using the terminology we just established, example 2 in section 4.9 can be stated 
as follows. 


Theorem 4.10.1. The space of trigonometric polynomials is dense in the space
$(C(S^1), \|\cdot\|_\infty)$. Explicitly, for every $f \in C(S^1)$ and every $\varepsilon > 0$, there exists a
trigonometric polynomial $p$ such that $\|f - p\|_\infty < \varepsilon$.


Lemma 4.10.2. For a function $f \in C[-\pi, \pi]$ (not necessarily periodic), and for every
$\varepsilon > 0$, there exists a $2\pi$-periodic function $g$ such that $\|f - g\|_2 < \varepsilon$.


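Proof. Let $M = \|f\|_\infty$ and define $\delta = \frac{\varepsilon^2}{8M^2}$ (we may assume that $M > 0$ and that
$\varepsilon$ is small enough that $\delta < \pi$). Define $g \in C(S^1)$ as follows:

$$g(x) = \begin{cases} \dfrac{f(-\pi + \delta)}{\delta}\,(x + \pi) & \text{if } -\pi \le x \le -\pi + \delta, \\[1ex] f(x) & \text{if } -\pi + \delta \le x \le \pi - \delta, \\[1ex] -\dfrac{f(\pi - \delta)}{\delta}\,(x - \pi) & \text{if } \pi - \delta \le x \le \pi. \end{cases}$$

Figure 4.7 below shows how $f$ is modified on the subintervals $[-\pi, -\pi + \delta]$ and
$[\pi - \delta, \pi]$ to produce $g$. We replace the graph of $f$ on the subinterval $[-\pi, -\pi + \delta]$
with the straight line that interpolates the points $(-\pi + \delta, f(-\pi + \delta))$ and $(-\pi, 0)$,
and similarly on the subinterval $[\pi - \delta, \pi]$. The dotted lines in figure 4.7 indicate
the modification of $f$ to produce $g$. By construction, $g$ is continuous and periodic.
Also, for $x \in [-\pi, \pi]$, $|f(x) - g(x)| \le 2M$. Now

$$\|f - g\|_2^2 = \frac{1}{2\pi} \int_{-\pi}^{-\pi+\delta} |f(x) - g(x)|^2\,dx + \frac{1}{2\pi} \int_{\pi-\delta}^{\pi} |f(x) - g(x)|^2\,dx$$

$$\le \frac{4M^2}{2\pi} \int_{-\pi}^{-\pi+\delta} dx + \frac{4M^2}{2\pi} \int_{\pi-\delta}^{\pi} dx = \frac{8M^2\delta}{2\pi} = \frac{\varepsilon^2}{2\pi} < \varepsilon^2. \qquad ∎$$

Figure 4.7 The modified function g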


Corollary 4.10.3. Trigonometric polynomials are dense in $(C[-\pi, \pi], \|\cdot\|_2)$.


Proof. Let $f \in C[-\pi, \pi]$, and let $\varepsilon > 0$. By lemma 4.10.2, there is a function
$g \in C(S^1)$ such that $\|f - g\|_2 < \varepsilon$. By theorem 4.10.1, there exists a trigonometric
polynomial $p$ such that $\|g - p\|_\infty < \varepsilon$. Now

$$\|f - p\|_2 \le \|f - g\|_2 + \|g - p\|_2 \le \|f - g\|_2 + \|g - p\|_\infty < 2\varepsilon.$$


Observe that the set of trigonometric polynomials with rational coefficients is
dense in $(C[-\pi, \pi], \|\cdot\|_2)$; hence $(C[-\pi, \pi], \|\cdot\|_2)$ is separable.


We are now able to settle a question posed in the preamble to this section. 


Theorem 4.10.4. For every function $f \in C[-\pi, \pi]$, the sequence of partial sums $S_n f$
converges in the mean square to $f$.


Proof. We need to show that $\lim_n \|f - S_n f\|_2 = 0$. Let $\varepsilon > 0$. By corollary 4.10.3,
there exists a trigonometric polynomial $p = \sum_{j=-N}^{N} a_j u_j$ such that $\|f - p\|_2 < \varepsilon$.
For every $n \ge N$, $p \in M_n = \mathrm{Span}(\{u_j : -n \le j \le n\})$. Because $S_n f$ is the
best approximation of $f$ in $M_n$, it follows that, for every $n \ge N$,
$\|f - S_n f\|_2 \le \|f - p\|_2 < \varepsilon$.

We take a short detour to discuss the sum of a two-sided sequence. The concept
framed in the following, more general, definition is sometimes useful. See the
excursion in section 7.2.


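Definition. Let $\{a_\alpha : \alpha \in I\}$ be an indexed set of nonnegative numbers, where $I$ is
an infinite set, possibly uncountable. The sum $\sum\{a_\alpha : \alpha \in I\}$ is, by definition,

$$\sum\{a_\alpha : \alpha \in I\} = \sup\Big\{ \sum_{\alpha \in F} a_\alpha \Big\},$$

where the supremum is taken over all finite subsets $F \subseteq I$.

Example 1. If $\sum\{a_\alpha : \alpha \in I\} < \infty$, then the set $J = \{\alpha \in I : a_\alpha > 0\}$ is countable.

Fix a positive integer $n$. If the set $J_n = \{\alpha \in I : a_\alpha > 1/n\}$ is infinite, we can
choose a sequence $\alpha_1, \alpha_2, \dots$ of distinct elements of $J_n$. For the finite subset
$F_j = \{\alpha_1, \dots, \alpha_j\}$, $\sum_{\alpha \in F_j} a_\alpha > j/n$. Since $j$ is arbitrary, $\sum\{a_\alpha : \alpha \in I\}$ would be
infinite. This proves that $J_n$ is a finite set for every $n \in \mathbb{N}$.

Since $(0, \infty) = \bigcup_{n=1}^{\infty} (1/n, \infty)$, $J = \bigcup_{n=1}^{\infty} J_n$. Thus $J$ is countable.


Example 2. Let $J$ be a countable set and suppose that $\sum\{a_\alpha : \alpha \in J\} < \infty$, where
$a_\alpha > 0$. If $\alpha_1, \alpha_2, \dots$ is any enumeration of $J$, then
$\sum\{a_\alpha : \alpha \in J\} = \sum_{n=1}^{\infty} a_{\alpha_n}$.
For an integer $N$, $\sum_{n=1}^{N} a_{\alpha_n} \le \sum\{a_\alpha : \alpha \in J\}$. Thus the partial sums of the
series $\sum_{n=1}^{\infty} a_{\alpha_n}$ are bounded by $\sum\{a_\alpha : \alpha \in J\}$, and hence
$\sum_{n=1}^{\infty} a_{\alpha_n} \le \sum\{a_\alpha : \alpha \in J\}$.
Conversely, if $F$ is an arbitrary finite subset of $J$, then there is an integer $N$ such
that $F \subseteq \{\alpha_1, \dots, \alpha_N\}$. Therefore
$\sum\{a_\alpha : \alpha \in F\} \le \sum_{n=1}^{N} a_{\alpha_n} \le \sum_{n=1}^{\infty} a_{\alpha_n}$. Thus
$\sum\{a_\alpha : \alpha \in J\} \le \sum_{n=1}^{\infty} a_{\alpha_n}$.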


A special case of the above examples is when $(a_n)_{n \in \mathbb{Z}}$ is a two-sided sequence
of nonnegative numbers.* The series $\sum_{n=-\infty}^{\infty} a_n$ can be defined (for example)
as $\lim_{N \to \infty} \sum_{n=-N}^{N} a_n$, which corresponds to the following enumeration of $\mathbb{Z}$:

$$0, -1, 1, -2, 2, -3, 3, \dots.$$

* More accurately, functions from $\mathbb{Z}$ to the base field $\mathbb{K}$.


We can now define, for $1 \le p < \infty$, the space $\ell^p(\mathbb{Z})$ to be the set of all two-sided
sequences $x = (x_n)_{n \in \mathbb{Z}}$ such that $\sum_{n=-\infty}^{\infty} |x_n|^p < \infty$. It is easy to check that $\ell^p(\mathbb{Z})$ is
a complete normed linear space with the norm $\|x\|_p = \left( \sum_{n=-\infty}^{\infty} |x_n|^p \right)^{1/p}$.

Similarly, we define $\ell^\infty(\mathbb{Z})$ to be the space of all bounded scalar functions
$x : \mathbb{Z} \to \mathbb{K}$, which is a complete normed linear space with the norm
$\|x\|_\infty = \sup\{|x(n)| : n \in \mathbb{Z}\}$.

We also define the space $c_0(\mathbb{Z})$ as the subspace of $\ell^\infty(\mathbb{Z})$ of all two-sided sequences
$x \in \ell^\infty(\mathbb{Z})$ such that $\lim_{|n| \to \infty} x_n = 0$.


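Theorem 4.10.5. For every $f \in C[-\pi, \pi]$,

$$\|f\|_2^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx = \sum_{n=-\infty}^{\infty} |\hat{f}(n)|^2.$$


Proof. By the continuity of norms (see section 4.3), $\lim_n \|S_n f\|_2^2 = \|f\|_2^2$. By the
Pythagorean theorem, $\|S_n f\|_2^2 = \big\| \sum_{j=-n}^{n} \hat{f}(j) u_j \big\|_2^2 = \sum_{j=-n}^{n} |\hat{f}(j)|^2$. We obtain the
required result by taking the limit of both sides as $n \to \infty$.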
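The identity of theorem 4.10.5 lends itself to a numerical sanity check. The sketch below is an illustration only: it assumes a midpoint quadrature rule to estimate the Fourier coefficients and the mean square of a sample function, and it truncates the two-sided sum at $|n| \le N$; the function names are hypothetical.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=2000):
    """Midpoint-rule estimate of (1/2pi) * integral over [-pi, pi] of f(t) e^{-int} dt."""
    h = 2.0 * math.pi / samples
    total = 0.0
    for k in range(samples):
        t = -math.pi + (k + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * t)
    return total * h / (2.0 * math.pi)


def mean_square(f, samples=2000):
    """Midpoint-rule estimate of ||f||_2^2 = (1/2pi) * integral of |f|^2 over [-pi, pi]."""
    h = 2.0 * math.pi / samples
    values = (abs(f(-math.pi + (k + 0.5) * h)) ** 2 for k in range(samples))
    return sum(values) * h / (2.0 * math.pi)


def parseval_partial_sum(f, N, samples=2000):
    """Sum of |f^(n)|^2 over -N <= n <= N, using the estimated coefficients."""
    return sum(abs(fourier_coefficient(f, n, samples)) ** 2 for n in range(-N, N + 1))


if __name__ == "__main__":
    f = lambda x: x * x
    print(mean_square(f), parseval_partial_sum(f, 30))
```

For $f(x) = x^2$ the two printed numbers should agree to several digits, with the truncated sum slightly smaller because the tail of the series is omitted.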


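Example 3. Let $f(x) = x^2$. We compute the Fourier coefficients of $f$. Since $x^2 \sin(nx)$
is an odd function and $x^2 \cos(nx)$ is an even function,
$\frac{1}{2\pi} \int_{-\pi}^{\pi} x^2 \sin(nx)\,dx = 0$, and thus
$\hat{f}(n) = \hat{f}(-n) = \frac{1}{\pi} \int_{0}^{\pi} x^2 \cos(nx)\,dx$.
For $n \ne 0$, integration by parts now yields

$$\frac{1}{\pi} \int_{0}^{\pi} x^2 \cos(nx)\,dx = \frac{-2}{\pi n} \int_{0}^{\pi} x \sin(nx)\,dx = \frac{2x}{\pi n^2} \cos(nx) \Big|_{0}^{\pi} = \frac{2\cos(n\pi)}{n^2} = \frac{2(-1)^n}{n^2}.$$

Thus

$$\hat{f}(n) = \hat{f}(-n) = \frac{2(-1)^n}{n^2} \quad (n \ne 0), \qquad \hat{f}(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} x^2\,dx = \frac{\pi^2}{3}.$$

We also have, by theorem 4.10.5,

$$\frac{\pi^4}{5} = \frac{1}{2\pi} \int_{-\pi}^{\pi} x^4\,dx = |\hat{f}(0)|^2 + 2\sum_{n=1}^{\infty} |\hat{f}(n)|^2 = \frac{\pi^4}{9} + 8\sum_{n=1}^{\infty} \frac{1}{n^4}.$$

Rearranging the extreme sides of the above string we obtain

$$\sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{\pi^4}{90}.$$

The next result says that a function in $C[-\pi, \pi]$ is determined by its Fourier
coefficients.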


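Corollary 4.10.6 (the uniqueness theorem). If $f, g \in C[-\pi, \pi]$ and $\hat{f}(n) = \hat{g}(n)$ for
every $n \in \mathbb{Z}$, then $f = g$.


Proof. Let $h = f - g$. By assumption, $\hat{h}(n) = 0$ for every $n \in \mathbb{Z}$. By theorem 4.10.5,

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} |h(x)|^2\,dx = \sum_{n=-\infty}^{\infty} |\hat{h}(n)|^2 = 0.$$

Since $h$ is continuous, $h = 0$.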


We now turn to the question of uniform convergence of Fourier series, which
is a more complicated problem than the mean square convergence. Since $C(S^1)$
is closed in $(C[-\pi, \pi], \|\cdot\|_\infty)$, the Fourier series of a non-periodic function $f$ on
$[-\pi, \pi]$ cannot converge uniformly to $f$. There are, however, simple criteria that
guarantee the uniform convergence of the Fourier series of $2\pi$-periodic functions.
The next theorem is a sample. See also example 5 below.


Theorem 4.10.7. If $f \in C(S^1)$ is such that $\sum_{n=-\infty}^{\infty} |\hat{f}(n)| < \infty$, then the Fourier series
of $f$ converges uniformly to $f$. In particular,

$$f(x) = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{inx} \quad\text{for every } x \in [-\pi, \pi].$$


Proof. For $n \in \mathbb{N}$, define $F_n(x) = \sum_{j=-n}^{n} \hat{f}(j) e^{ijx}$. For positive integers $m > n$,

$$\|F_m - F_n\|_\infty = \sup_{x \in [-\pi, \pi]} \Big| \sum_{|j|=n+1}^{m} \hat{f}(j) e^{ijx} \Big| \le \sum_{|j|=n+1}^{m} |\hat{f}(j)| \to 0 \quad\text{as } n \to \infty.$$

Thus the sequence of functions $F_n$ is Cauchy in $C(S^1)$ and hence converges
uniformly on $[-\pi, \pi]$ to some function $F \in C(S^1)$. Thus $\lim_n \|F_n - F\|_\infty = 0$.
But $\|F_n - F\|_2 \le \|F_n - F\|_\infty$, and hence $F_n$ converges to $F$ in $\|\cdot\|_2$. Since $F_n$ also
converges to $f$ in $\|\cdot\|_2$ (theorem 4.10.4), $F = f$ by the uniqueness of limits. ∎


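Example 4. This is a continuation of example 3. For the function $f(x) = x^2$,
$\sum_{n=1}^{\infty} |\hat{f}(n)| = \sum_{n=1}^{\infty} \frac{2}{n^2} < \infty$. The Fourier series of $x^2$ converges uniformly
(and absolutely) to $x^2$ on $[-\pi, \pi]$. Since $\hat{f}(n) = \hat{f}(-n)$,
$\hat{f}(n) e^{inx} + \hat{f}(-n) e^{-inx} = \hat{f}(n)(e^{inx} + e^{-inx}) = 2\hat{f}(n)\cos(nx)$. Therefore

$$x^2 = \frac{\pi^2}{3} + 4\sum_{n=1}^{\infty} \frac{(-1)^n \cos(nx)}{n^2}.$$

Substituting $x = 0$ and then $x = \pi$ in this identity, we obtain, respectively,

$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2} = \frac{\pi^2}{12}, \qquad \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$$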
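The uniform convergence in example 4 can be observed directly. The sketch below evaluates the partial sums of the cosine series of $x^2$ on a grid (the grid and the function names are assumptions made for illustration) and compares the worst-case error with the tail bound $4\sum_{n>N} 1/n^2 < 4/N$ that follows from the proof of theorem 4.10.7.

```python
import math


def partial_sum(x, N):
    """S_N(x) = pi^2/3 + 4 * sum_{n=1}^{N} (-1)^n cos(nx) / n^2."""
    series = sum((-1) ** n * math.cos(n * x) / n ** 2 for n in range(1, N + 1))
    return math.pi ** 2 / 3 + 4.0 * series


def uniform_error(N, samples=401):
    """Grid estimate of sup |x^2 - S_N(x)| over [-pi, pi]."""
    pts = [-math.pi + 2.0 * math.pi * k / (samples - 1) for k in range(samples)]
    return max(abs(x * x - partial_sum(x, N)) for x in pts)


if __name__ == "__main__":
    for N in (10, 100):
        print(N, uniform_error(N), 4.0 / N)
```

The error is largest at the endpoints $x = \pm\pi$, where every term of the tail has the same sign.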


Example 5. If a function $F \in C(S^1)$ has a continuous derivative, then the Fourier
series of $F$ converges uniformly to $F$ on $[-\pi, \pi]$.

Let $F' = f$. If $n \ne 0$, integration by parts yields

$$\hat{F}(n) = \frac{-1}{2\pi i n} F(t) e^{-int} \Big|_{-\pi}^{\pi} + \frac{1}{2\pi i n} \int_{-\pi}^{\pi} f(t) e^{-int}\,dt = \frac{1}{in} \hat{f}(n).$$

Using the inequality $|ab| \le \frac{1}{2}\big[|a|^2 + |b|^2\big]$, we have

$$|\hat{F}(n)| \le \frac{1}{2}\Big[ \frac{1}{n^2} + |\hat{f}(n)|^2 \Big]$$

and

$$\sum_{|n|=1}^{\infty} |\hat{F}(n)| \le \frac{1}{2}\Big[ 2\sum_{n=1}^{\infty} \frac{1}{n^2} + \sum_{|n|=1}^{\infty} |\hat{f}(n)|^2 \Big] < \infty.$$

The result now follows from theorem 4.10.7.


Orthogonal Polynomials: The General Construction 


Let $(a,b)$ be an interval. A function $\omega : (a,b) \to \mathbb{R}$ is said to be a weight
function if

(a) $\omega$ is continuous and strictly positive on $(a, b)$,
(b) $\int_a^b \omega(x)\,dx < \infty$, and
(c) for every integer $n > 0$, $\int_a^b x^n \omega(x)\,dx < \infty$.

Consequently, $\int_a^b p(x)\omega(x)\,dx < \infty$ for every polynomial $p$. Neither the function
$\omega$ nor the interval $(a, b)$ is assumed to be bounded.

When either $\omega$ or $(a,b)$ is unbounded, we interpret the integrals involved
as improper Riemann integrals according to the standard definitions. Observe
that if $(a,b)$ is a bounded interval, then $\omega$ can be unbounded if and only if
$\lim_{x \downarrow a} \omega(x) = \infty$ or $\lim_{x \uparrow b} \omega(x) = \infty$. See the weight function for the Tchebychev
polynomials later on in this section.


Let $H$ be the collection of all continuous functions on $(a, b)$ such that

$$\int_a^b |f(x)|^2 \omega(x)\,dx < \infty.$$

It is obvious that $H$ is closed under complex conjugation and scalar multiplication.
The following estimates show that $H$ is a vector space. If $f, g \in H$, then

$$\int_a^b |f(x)g(x)|\,\omega(x)\,dx \le \frac{1}{2} \int_a^b \big[ |f(x)|^2 + |g(x)|^2 \big] \omega(x)\,dx < \infty$$

and

$$\int_a^b |f + g|^2 \omega(x)\,dx \le \int_a^b (|f| + |g|)^2 \omega(x)\,dx = \int_a^b \big[ |f|^2 \omega(x) + 2|fg|\,\omega(x) + |g|^2 \omega(x) \big]\,dx < \infty.$$

We call $H$ the space of continuous, square integrable functions with respect to
the weight function $\omega$. It now makes sense to define the following inner product
on $H$:

$$\langle f, g \rangle = \int_a^b f(x)\overline{g(x)}\,\omega(x)\,dx.$$

The Gram-Schmidt process can be applied to the sequence of independent functions
$\{1, x, x^2, \dots\}$ to yield a sequence of orthogonal polynomials $\varphi_0, \varphi_1, \dots$. The
orthogonal polynomials in this general construction (regardless of the weight 
function or the interval (a,b)) share broad characteristics, which we will not 
discuss further. See the section exercises for some of the general features of 
orthogonal polynomials. In the remainder of this section, we give three major 
examples of orthogonal polynomials. 
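The general construction can be carried out exactly on a computer. The following is a minimal sketch for the weight $\omega(x) = 1$ on $(-1, 1)$, assuming polynomials are stored as coefficient lists (lowest degree first) and that the moments $\int_{-1}^{1} x^k\,dx$ are supplied in closed form; the function names are hypothetical. Other weight functions only require a different `moment`.

```python
from fractions import Fraction


def moment(k):
    """Integral of x^k over (-1, 1) with weight 1."""
    return Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)


def inner(p, q):
    """<p, q> for real polynomials given as coefficient lists (lowest degree first)."""
    return sum(a * b * moment(i + j) for i, a in enumerate(p) for j, b in enumerate(q))


def gram_schmidt(n):
    """Monic orthogonal polynomials phi_0, ..., phi_n obtained from 1, x, ..., x^n."""
    basis = []
    for k in range(n + 1):
        v = [Fraction(0)] * k + [Fraction(1)]      # the monomial x^k
        for phi in basis:
            c = inner(v, phi) / inner(phi, phi)    # coefficient of the projection on phi
            for i, a in enumerate(phi):
                v[i] -= c * a
        basis.append(v)
    return basis


if __name__ == "__main__":
    for phi in gram_schmidt(3):
        print([str(c) for c in phi])
```

Exact rational arithmetic is used so that orthogonality can be tested with `==` rather than with a tolerance.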


The Legendre Polynomials 


In this special case, we take

$$(a,b) = (-1,1)$$

and

$$\omega(x) = 1.$$

Observe that the space $H$ of continuous, square integrable functions on $(-1,1)$
contains the entire space $C[-1,1]$. The resulting orthogonal polynomials are
the well-known Legendre polynomials. In section 3.7, we derived the following
formula for the Legendre polynomials (up to a multiplicative constant):

$$Q_n(x) = D^n (x^2 - 1)^n.$$

The first two Legendre polynomials are obvious: $Q_0(x) = 1$, and $Q_1(x) = 2x$. We
establish the properties of the Legendre polynomials in a number of steps:

1. The parity of the Legendre polynomials. Since the binomial expansion of
$(x^2 - 1)^n$ contains only even powers of $x$, $Q_n$ is an even polynomial if $n$ is
even and an odd polynomial if $n$ is odd.

2. The normalized Legendre polynomials $P_n$. It is customary to normalize the
Legendre polynomials so that they take the value 1 at $x = 1$. Thus $P_n = c_n Q_n$,
and we find $c_n$ by imposing the condition $P_n(1) = 1$. Using the Leibniz rule,

$$Q_n(x) = D^n (x^2 - 1)^n = D^{n-1}\big[ 2nx(x^2 - 1)^{n-1} \big] = 2nx\,D^{n-1}(x^2 - 1)^{n-1} + (n-1)\,2n\,D^{n-2}(x^2 - 1)^{n-1}.$$

Evaluating the last identity at $x = 1$, we obtain $Q_n(1) = 2nQ_{n-1}(1)$ and, by
induction, $Q_n(1) = 2^n n!$. Therefore the polynomials

$$P_n(x) = \frac{1}{2^n n!} D^n (x^2 - 1)^n$$

are orthogonal and satisfy the normalization condition $P_n(1) = 1$.


3. The leading coefficient of $P_n$. We use the symbol $\alpha_n$ to indicate the leading
coefficient of $P_n$. The leading coefficient in $D^n(x^2 - 1)^n$ is the result of
differentiating $x^{2n}$ exactly $n$ times. Therefore

$$\alpha_n = \frac{1}{2^n n!} (2n)(2n - 1) \cdots (n + 1) = \frac{(2n)!}{2^n (n!)^2}.$$

4. The three-term recurrence relation. The recurrence relation we derive
below facilitates the generation of the sequence $P_n$. We will use the brief
notation $\beta_n = \frac{\alpha_{n+1}}{\alpha_n} = \frac{2n+1}{n+1}$. The polynomial $P_{n+1} - \beta_n x P_n$ has degree at
most $n$. Therefore,

$$P_{n+1} - \beta_n x P_n = \sum_{j=0}^{n} c_j P_j,$$

where

$$c_j \|P_j\|_2^2 = \langle P_{n+1} - \beta_n x P_n, P_j \rangle = -\beta_n \langle x P_n, P_j \rangle.$$

If $j < n - 1$, $c_j \|P_j\|_2^2 = -\beta_n \langle x P_n, P_j \rangle = -\beta_n \langle P_n, x P_j \rangle = 0$, since $x P_j$ has degree
less than $n$. Thus $P_{n+1} - \beta_n x P_n = c_n P_n + c_{n-1} P_{n-1}$. Now

$$c_n \|P_n\|_2^2 = -\beta_n \langle x P_n, P_n \rangle = -\beta_n \int_{-1}^{1} x P_n^2(x)\,dx = 0$$

since $x P_n^2(x)$ is an odd function. Therefore

$$P_{n+1} - \beta_n x P_n = c_{n-1} P_{n-1}.$$

Evaluating the last identity at $x = 1$, we obtain $c_{n-1} = 1 - \beta_n = \frac{-n}{n+1}$, and we
have the recurrence relation:

$$P_{n+1} = \frac{2n+1}{n+1}\,x P_n - \frac{n}{n+1}\,P_{n-1}.$$

Here is a list of the first Legendre polynomials:

$$P_0(x) = 1,$$
$$P_1(x) = x,$$
$$P_2(x) = \tfrac{1}{2}(3x^2 - 1),$$
$$P_3(x) = \tfrac{1}{2}(5x^3 - 3x),$$
$$P_4(x) = \tfrac{1}{8}(35x^4 - 30x^2 + 3).$$
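The recurrence relation translates directly into code. The sketch below is one possible rendering, assuming polynomials are stored as coefficient lists (lowest degree first) with exact rational entries; the function name `legendre` is hypothetical.

```python
from fractions import Fraction


def legendre(n_max):
    """Return [P_0, ..., P_n_max], each a coefficient list (lowest degree first),
    generated by P_{n+1} = ((2n+1) x P_n - n P_{n-1}) / (n+1)."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]              # coefficients of x * P_n
        prev = polys[n - 1] + [Fraction(0)] * 2      # P_{n-1}, padded to the same length
        nxt = [((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(x_pn, prev)]
        polys.append(nxt)
    return polys[: n_max + 1]


if __name__ == "__main__":
    for p in legendre(4):
        print([str(c) for c in p])
```

Since $P_n(1)$ is the sum of the coefficients, the normalization $P_n(1) = 1$ can be checked by summing each list.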


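5. The norm of $P_n$. Next we show that

$$\int_{-1}^{1} P_n^2(x)\,dx = \frac{2}{2n+1}.$$

Let $a_n = \int_{-1}^{1} [P_n(x)]^2\,dx$. Taking the inner product of $P_n$ with both sides of the
identity $P_n = \frac{2n-1}{n}\,x P_{n-1} - \frac{n-1}{n}\,P_{n-2}$, we obtain $a_n = \frac{2n-1}{n} \int_{-1}^{1} x P_n P_{n-1}\,dx$.
Using the recurrence relation again, $x P_n = [(n+1)P_{n+1} + nP_{n-1}]/(2n+1)$,
and hence

$$a_n = \frac{2n-1}{n} \cdot \frac{1}{2n+1} \int_{-1}^{1} P_{n-1}\big[ (n+1)P_{n+1} + nP_{n-1} \big]\,dx = \frac{2n-1}{2n+1}\,a_{n-1}.$$

Now $a_0 = \int_{-1}^{1} dx = 2$. By induction, one obtains

$$a_n = \frac{2}{2n+1}.$$

It follows that the polynomials below are orthonormal in $(C[-1, 1], \|\cdot\|_2)$:

$$\tilde{P}_n = \sqrt{\frac{2n+1}{2}}\,P_n.$$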

Theorem 4.10.8 (mean square convergence). For every $f \in C[-1, 1]$, the sequence
$S_n f = \sum_{j=0}^{n} \langle f, \tilde{P}_j \rangle \tilde{P}_j$ converges to $f$ in the sense that $\lim_n \|S_n f - f\|_2 = 0$.


Proof. Let $\varepsilon > 0$. By the Weierstrass approximation theorem, there exists a polynomial
$q$ such that $\|f - q\|_\infty < \varepsilon/\sqrt{2}$. Now $\|f - q\|_2 \le \sqrt{2}\,\|f - q\|_\infty < \varepsilon$. Let $N$ be the
degree of $q$. For every $n \ge N$, $q \in \mathbb{P}_n$, and since $S_n f$ is the best approximation of $f$
in $\mathbb{P}_n$, $\|f - S_n f\|_2 \le \|f - q\|_2 < \varepsilon$, as required. ∎


Observe the resemblance between the proof of the last theorem and that of
theorem 4.10.4. See also the examples in section 6.1.


The Tchebychev Polynomials 


In this special case, we take

$$(a,b) = (-1,1)$$

and

$$\omega(x) = \frac{1}{\sqrt{1 - x^2}}.$$

Observe that the space $H$ of square integrable functions with respect to $\omega$ contains
the entire space $C[-1, 1]$.


A simple and direct derivation of the orthogonal polynomials is possible because of
the observation that, for an integer $n \ge 0$, $\cos(nx)$ can be expressed as a polynomial
of $\cos x$. For example, $\cos(2x) = 2\cos^2 x - 1$. The next lemma proves the existence of
such polynomials and establishes the three-term recurrence relation among them.


Lemma 4.10.9. For $n \ge 0$, there exists a polynomial $T_n$ of exact degree $n$ such that,
for all $x \in \mathbb{R}$, $\cos(nx) = T_n(\cos x)$.


Proof. For $n = 0, 1$, the polynomials $T_0(x) = 1$ and $T_1(x) = x$ trivially satisfy the
requirements. The rest of the construction is inductive. Suppose that there are
polynomials $T_0, \dots, T_n$ that satisfy the statement we wish to prove. For $n \ge 1$, we
have $\cos(n+1)x + \cos(n-1)x = 2\cos(nx)\cos x$. Therefore

$$\cos(n+1)x = 2\cos(nx)\cos x - \cos(n-1)x = 2\cos x\,T_n(\cos x) - T_{n-1}(\cos x).$$

The last identity dictates the definition of $T_{n+1}$ and concludes the proof:

$$T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x). \qquad ∎$$

Definition. The polynomials $T_n$ in the previous lemma are called the Tchebychev
polynomials. A list of the next three Tchebychev polynomials appears below:

$$T_2(x) = 2x^2 - 1,$$
$$T_3(x) = 4x^3 - 3x,$$
$$T_4(x) = 8x^4 - 8x^2 + 1.$$
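The recurrence of lemma 4.10.9 can be run mechanically to extend the list. The sketch below assumes integer coefficient lists (lowest degree first) and uses Horner's rule for evaluation; the names `chebyshev` and `evaluate` are hypothetical. The defining identity $\cos(n\theta) = T_n(\cos\theta)$ serves as an independent check.

```python
import math


def chebyshev(n_max):
    """Return [T_0, ..., T_n_max] as integer coefficient lists (lowest degree first),
    generated by T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)."""
    polys = [[1], [0, 1]]
    for n in range(1, n_max):
        two_x_tn = [0] + [2 * c for c in polys[n]]   # coefficients of 2x * T_n
        prev = polys[n - 1] + [0, 0]                 # T_{n-1}, padded to the same length
        polys.append([a - b for a, b in zip(two_x_tn, prev)])
    return polys[: n_max + 1]


def evaluate(p, x):
    """Horner evaluation of a coefficient list at x."""
    result = 0.0
    for c in reversed(p):
        result = result * x + c
    return result


if __name__ == "__main__":
    T = chebyshev(5)
    theta = 0.7
    print(T[5], evaluate(T[5], math.cos(theta)), math.cos(5 * theta))
```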


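Theorem 4.10.10. The Tchebychev polynomials are orthogonal with respect to the
weight function $\omega$. Additionally,

$$\|T_0\|_2^2 = \pi$$

and, for $n \ge 1$,

$$\|T_n\|_2^2 = \frac{\pi}{2}.$$

Proof. We use the change of variable $x = \cos\theta$. If $m \ne n$, then

$$\langle T_m, T_n \rangle = \int_{-1}^{1} \frac{T_m(x)T_n(x)}{\sqrt{1 - x^2}}\,dx = \int_{0}^{\pi} \cos(n\theta)\cos(m\theta)\,d\theta = \frac{1}{2} \int_{0}^{\pi} \big[ \cos(m+n)\theta + \cos(m-n)\theta \big]\,d\theta = 0.$$

Finally, for $n \ge 1$,

$$\|T_n\|_2^2 = \int_{-1}^{1} \frac{T_n^2(x)}{\sqrt{1 - x^2}}\,dx = \int_{0}^{\pi} \cos^2(n\theta)\,d\theta = \frac{1}{2} \int_{0}^{\pi} \big[ 1 + \cos(2n\theta) \big]\,d\theta = \frac{\pi}{2},$$

while $\|T_0\|_2^2 = \int_{0}^{\pi} d\theta = \pi$. ∎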
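The change of variable used in the proof also gives a convenient way to test the theorem numerically. The sketch below evaluates $T_n$ through the three-term recurrence (so that the identity $T_n(\cos\theta) = \cos(n\theta)$ is not presupposed) and applies a midpoint rule in $\theta$; the quadrature rule and the function names are assumptions made for illustration.

```python
import math


def cheb_value(n, x):
    """T_n(x), computed from T_0 = 1, T_1 = x and T_{k+1} = 2x T_k - T_{k-1}."""
    if n == 0:
        return 1.0
    t_prev, t_cur = 1.0, x
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur


def weighted_inner(m, n, samples=200):
    """Midpoint-rule estimate of the integral of T_m T_n / sqrt(1 - x^2) over (-1, 1),
    computed after the substitution x = cos(theta) as an integral over (0, pi)."""
    h = math.pi / samples
    total = 0.0
    for k in range(samples):
        x = math.cos((k + 0.5) * h)
        total += cheb_value(m, x) * cheb_value(n, x)
    return total * h


if __name__ == "__main__":
    print(weighted_inner(2, 3), weighted_inner(3, 3), weighted_inner(0, 0))
```

The substitution removes the endpoint singularity of the weight, which is why a plain midpoint rule suffices here.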


The basic properties of the Tchebychev polynomials appear below. The first three
follow from the three-term recurrence relation and induction:

1. $T_n$ is even if and only if $n$ is even.

2. The leading term of $T_n$ is $2^{n-1}x^n$ (for $n \ge 1$).

3. $T_n(1) = 1$, and $T_n(-1) = (-1)^n$.

4. For all $n \ge 0$, $\|T_n\|_\infty = \max\{|T_n(x)| : -1 \le x \le 1\} = 1$. For every
$x \in [-1, 1]$, there is a number $\theta$ such that $x = \cos\theta$. Thus $|T_n(x)| = |\cos(n\theta)| \le 1$.
Since $T_n(1) = 1$, it follows that $\|T_n\|_\infty = 1$.

5. The roots of $T_n$ are $x_k = \cos\frac{(2k-1)\pi}{2n}$, $1 \le k \le n$. This can be verified directly,
or one can write $x = \cos\theta$. If $T_n(x) = 0$, then $\cos(n\theta) = 0$. So $n\theta$ is an odd
multiple of $\pi/2$; hence the stated values of $x_k$ are all the roots of $T_n$.

6. The extreme values of $T_n$ in $[-1,1]$ are attained at the points $y_k = \cos\frac{k\pi}{n}$,
$0 \le k \le n$. Additionally, $T_n(y_k) = (-1)^k$. Again a direct verification is the
simplest or, as before, we write $x = \cos\theta$, then $T_n(x) = \cos(n\theta)$, and

$$\frac{dT_n(x)}{dx} = \frac{d\cos(n\theta)}{d\theta}\,\frac{d\theta}{dx} = \frac{n\sin(n\theta)}{\sqrt{1 - x^2}}.$$

The interested reader can work out the calculus and arrive at the points $y_k$.


For $n \ge 1$, let $\tilde{T}_n = \frac{1}{2^{n-1}} T_n$. From the above properties of $T_n$, $\tilde{T}_n$ is a monic
polynomial,* and

$$\tilde{T}_n(x_k) = 0 \quad\text{for } 1 \le k \le n,$$

$$\tilde{T}_n(y_k) = \frac{(-1)^k}{2^{n-1}} \quad\text{for } 0 \le k \le n,$$

and $\|\tilde{T}_n\|_\infty = \frac{1}{2^{n-1}}$.


The following theorem establishes the curious fact that, among all monic polynomials
of degree $n$, $\tilde{T}_n$ has the least uniform norm on $[-1,1]$. This result is
important for understanding the error when a sufficiently differentiable function
is interpolated by a polynomial.


Theorem 4.10.11. Suppose $p$ is a monic polynomial of degree $n \ge 1$. Then

$$\|p\|_\infty = \max\{|p(x)| : -1 \le x \le 1\} \ge \frac{1}{2^{n-1}}.$$


Proof. Suppose, for a contradiction, that $\|p\|_\infty < \frac{1}{2^{n-1}}$. Consider the integers
$0 \le k \le n$. If $k$ is odd, then $p(y_k) > -\frac{1}{2^{n-1}} = \tilde{T}_n(y_k)$. If $k$ is even, then
$p(y_k) < \frac{1}{2^{n-1}} = \tilde{T}_n(y_k)$.
Thus the polynomial $q = p - \tilde{T}_n$ alternates sign at the points $y_0, \dots, y_n$; hence $q$ has
a root in each of the $n$ open intervals $(y_1, y_0), \dots, (y_n, y_{n-1})$. This is a contradiction
because $q$ is a nonzero polynomial of degree at most $n - 1$. ∎
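Theorem 4.10.11 can be illustrated by comparing a few monic polynomials of degree 4. The sketch below estimates uniform norms on an equally spaced grid that contains the endpoints; the grid size, the two competitors (the monomial $x^4$ and the monic multiple of the Legendre polynomial $P_4$), and the function name are choices made for illustration.

```python
def sup_norm(p, samples=2001):
    """Grid estimate of max |p(x)| on [-1, 1]; p is a coefficient list, lowest degree first."""
    best = 0.0
    for k in range(samples):
        x = -1.0 + 2.0 * k / (samples - 1)
        value = 0.0
        for c in reversed(p):          # Horner's rule
            value = value * x + c
        best = max(best, abs(value))
    return best


# Three monic polynomials of degree 4.
monic_chebyshev_4 = [1 / 8, 0.0, -1.0, 0.0, 1.0]       # T_4 / 2^3 = x^4 - x^2 + 1/8
monomial_4 = [0.0, 0.0, 0.0, 0.0, 1.0]                 # x^4
monic_legendre_4 = [3 / 35, 0.0, -6 / 7, 0.0, 1.0]     # (8/35) P_4

if __name__ == "__main__":
    for name, p in [("Tchebychev", monic_chebyshev_4),
                    ("monomial", monomial_4),
                    ("Legendre", monic_legendre_4)]:
        print(name, sup_norm(p))
```

Only the monic Tchebychev polynomial should reach the lower bound $1/2^{3}$; the other two stay strictly above it.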
* A monic polynomial is one whose leading coefficient is 1.


The Hermite Polynomials


For our last example of orthogonal polynomials, we take

$$(a, b) = (-\infty, \infty)$$

and

$$\omega(x) = e^{-x^2}.$$

We will show that the polynomials defined below are orthogonal with respect
to $\omega$:

$$H_n(x) = (-1)^n e^{x^2} D^n\big[ e^{-x^2} \big].$$

Since $H_0 = 1$, and $H_1(x) = 2x$, $\langle H_0, H_1 \rangle = \int_{-\infty}^{\infty} 2x e^{-x^2}\,dx = 0$. We now use
induction on $n$. If $0 \le j < n$, then integration by parts yields

$$\langle x^j, H_n \rangle = \int_{-\infty}^{\infty} x^j (-1)^n e^{x^2} D^n\big[ e^{-x^2} \big] e^{-x^2}\,dx = (-1)^n \int_{-\infty}^{\infty} x^j D^n\big[ e^{-x^2} \big]\,dx$$

$$= (-1)^n x^j D^{n-1} e^{-x^2} \Big|_{-\infty}^{\infty} - (-1)^n \int_{-\infty}^{\infty} j x^{j-1} D^{n-1} e^{-x^2}\,dx.$$

The first term of the last expression is 0 because $x^j D^{n-1} e^{-x^2}$ is the product of a
polynomial and $e^{-x^2}$, and the second term is 0 by the inductive hypothesis.


We leave some of the properties of the Hermite polynomials as exercises for the
interested reader.
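The definition can be unwound into a simple recursion: differentiating $D^n[e^{-x^2}] = (-1)^n H_n(x) e^{-x^2}$ once more gives $H_{n+1} = 2xH_n - H_n'$. The sketch below uses this recursion together with the standard Gaussian moments $\int_{-\infty}^{\infty} x^{2m} e^{-x^2}\,dx = \sqrt{\pi}\,\frac{(2m)!}{4^m m!}$ (a known fact that is not derived in the text) to test orthogonality exactly; inner products are reported in units of $\sqrt{\pi}$, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def hermite(n_max):
    """Return [H_0, ..., H_n_max] as integer coefficient lists (lowest degree first),
    generated by H_{n+1} = 2x H_n - H_n'."""
    polys = [[1]]
    for _ in range(n_max):
        h = polys[-1]
        two_x_h = [0] + [2 * c for c in h]               # coefficients of 2x * H_n
        dh = [i * c for i, c in enumerate(h)][1:]        # coefficients of H_n'
        dh = dh + [0] * (len(two_x_h) - len(dh))         # pad to the same length
        polys.append([a - b for a, b in zip(two_x_h, dh)])
    return polys


def gaussian_moment(k):
    """Integral of x^k e^{-x^2} over the real line, divided by sqrt(pi)."""
    if k % 2 == 1:
        return Fraction(0)
    m = k // 2
    return Fraction(factorial(2 * m), 4 ** m * factorial(m))


def weighted_inner(p, q):
    """<p, q> with weight e^{-x^2}, in units of sqrt(pi)."""
    return sum(a * b * gaussian_moment(i + j)
               for i, a in enumerate(p) for j, b in enumerate(q))


if __name__ == "__main__":
    H = hermite(4)
    print([[weighted_inner(H[i], H[j]) for j in range(5)] for i in range(5)])
```

The printed matrix of inner products should be diagonal, which is the orthogonality statement proved above.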
Exercises 


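1. Let $f, g \in C[-\pi, \pi]$. Prove that
$\frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\overline{g(x)}\,dx = \sum_{n=-\infty}^{\infty} \hat{f}(n)\overline{\hat{g}(n)}$. Hint:
Use theorem 4.10.4 and the continuity of inner products. See section 4.3.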


2. Show that |x|=—-—-- me i —_ an <x<z. Conclude that 
wo (-1)""! a= a 
Linz 1 Qn— ve & 
"+1 sin(nx 
3. Show that ~~~ = = ; Cae 


n> 
4. Use the previous problem to show that ))~_, ae 
5. This exercise furnishes the three-term recurrence relation for general
orthogonal polynomials with respect to a weight function $\omega$ on an interval
$(a,b)$, and the inner product $\langle f, g \rangle = \int_a^b f(x)\overline{g(x)}\,\omega(x)\,dx$. Let $\varphi_0, \varphi_1, \dots$ be
the orthogonal monic polynomials with respect to the weight function
$\omega$, where $\varphi_0 = 1$, and $\varphi_n$ has degree $n$.* Prove the three-term recurrence
relation below:

$$\varphi_{n+1}(x) = (x - a_{n+1})\,\varphi_n(x) - b_n^2\,\varphi_{n-1}(x),$$

where

$$a_{n+1} = \frac{\langle x\varphi_n, \varphi_n \rangle}{\|\varphi_n\|^2}$$

and

$$b_n = \frac{\|\varphi_n\|}{\|\varphi_{n-1}\|} \quad (n \ge 1).$$

Here $n \ge 0$, and, for notational convenience, define $\varphi_{-1} = 0$, $b_0 = 0$.

* Observe that these are precisely the orthogonal polynomials generated by applying the
Gram-Schmidt process to the monomials $1, x, x^2, \dots$.


. In the notation of the previous exercise, prove that the roots of ¢,, are the 


eigenvalues of the tri-diagonal matrix 


a, b, 


. Prove that all the roots of ¢,, are real and simple and lie in the interval (a, b). 


Outline: Since (¢o,¢,,) = 0, ie $,wdx = 0. Thus ¢,, changes sign in (a,b), 
and hence it has at least one root of odd multiplicity. Let x,,...,x, be the 
roots of $, of odd multiplicity in (a,b), and let q = (x—x,)...(x—x,). If 
r<n, examine (q,¢,). 


8. Prove that the Legendre polynomial $P_n$ satisfies the differential equation

$$(x^2 - 1)P_n'' + 2xP_n' - n(n+1)P_n = 0.$$

9. Prove that the sum of the coefficients of any Legendre polynomial is 1. The
same is true for the Tchebychev polynomials.
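10. Prove that $C[-1,1]$ is contained in the space of continuous square integrable
functions on $(-1,1)$ with respect to $w(x) = \frac{1}{\sqrt{1-x^2}}$. The integrals
involved are improper Riemann integrals.
11. Define the normalized Tchebychev polynomials $\hat{T}_0 = \frac{T_0}{\sqrt{\pi}}$ and $\hat{T}_n = \sqrt{\frac{2}{\pi}}\,T_n$ for $n \ge 1$.
For a function $f \in C[-1,1]$, let $S_n f = \sum_{j=0}^{n} \langle f, \hat{T}_j\rangle \hat{T}_j$. Prove that
$\lim_n \|S_n f - f\|_2 = 0$.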

12. Prove that $H_{n+1} = 2xH_n - 2nH_{n-1}$. Conclude that $H_n$ is even if and only if
n is even.

13. Prove that $H_n' = 2xH_n - H_{n+1}$. Conclude that $H_n' = 2nH_{n-1}$.

14. Compute $H_2$ and $H_3$.

15. Show that $\|H_n\|_2^2 = \int_{-\infty}^{\infty} [H_n(x)]^2 e^{-x^2}\,dx = n!\,2^n\sqrt{\pi}$.

16. Prove that the Hermite polynomial $H_n$ satisfies the differential equation
$H_n''(x) - 2xH_n'(x) + 2nH_n(x) = 0$.


5
Essentials of General Topology 


Considering that he only had three years to devote to topology, he made his mark 
in his chosen field with brilliance and passion. He transformed the subject into 
a rich domain of modern mathematics. How much more might there have been, 
had he not died so young?¹

Crilly and Johnson wrote of Pavel Urysohn 




Pavel Urysohn. 1898-1924 


In 1915 Urysohn entered the University of Moscow to study physics. However, his 
interest in physics soon took second place, for, after attending lectures by Luzin 
and Egoroff, he began to concentrate on mathematics. Urysohn graduated in 1919 
and continued working toward his doctorate. In June 1921, he became an assistant 
professor at the University of Moscow. 

Urysohn soon turned to topology. Egoroff gave him two problems in 1921. 
These were difficult problems that had been around for some time. Egoroff was 
not to be disappointed. Near the end of August, even before working out the details, 
Urysohn had the correct ideas for solving the problems. During the following year, 
Urysohn worked through the details, building a whole new area of dimension 
theory in topology. It was an exciting time for topologists in Moscow, for Urysohn 
lectured on the topology of continua, and often his latest results were presented 
in the course shortly after he had proved them. He published a series of short 
notes on this topic during 1922. The complete theory was presented in an article 


1 T. Crilly and D. Johnson, “The emergence of topological dimension theory,” in I. M. James (ed.),
History of Topology (New York: Elsevier, 1999), 1-24.


Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules. 
that Lebesgue accepted for publication in the Comptes rendus of the Academy of
Sciences in Paris. This gave Urysohn an international platform for his ideas, which 
immediately attracted the interest of mathematicians such as Hilbert, Hausdorff, 
and Brouwer. In addition to advancing dimension theory, Urysohn is credited for 
an important metrization theorem. He is particularly remembered for “Urysohn’s
lemma,” which establishes the existence of a continuous function taking the values
0 and 1 on disjoint closed subsets of a normal space. 


Urysohn published a full version of his dimension theory in Fundamenta 
Mathematicae. He wrote a major paper in two parts in 1923, but they did not 
appear in print until 1925 and 1926. Sadly, Urysohn died in a drowning accident 
before even the first part was published. His untimely death generated much 
sadness in the mathematical community. 


In the summer of 1924, Urysohn set off with Alexandroff on a European trip 
through Germany, Holland, and France. The two mathematicians visited Hilbert. 
After they left, Hilbert wrote to Urysohn, informing him that his paper with 
Alexandroff had been accepted for publication in Mathematische Annalen, and 
expressing the hope that Urysohn would visit again the following summer. They 
then met Hausdorff, who was impressed with Urysohn’s results. He also wrote a 
letter to Urysohn, which was dated August 11, 1924. The letter discusses Urysohn’s 
metrization theorem and his construction of a universal separable metric space 
(one into which any separable metric space can be injected), which was one of 
Urysohn’s last results. Like Hilbert, Hausdorff expressed the hope that Urysohn 
would visit again the following summer. Van Dalen writes about their final 
mathematical visit, which was to Brouwer:² “This time [Urysohn and Alexandroff]
visited Brouwer, who was most favourably impressed by the two Russians. He 
was particularly taken with Urysohn, for whom he developed something like the 
attachment to a lost son.” 


5.1 Definitions and Basic Properties 


While the metric topology is often sufficient for most introductory courses in 
analysis, a good understanding of the elements of general topology is essential for 
any advanced study of analysis. An attempt to define topology in a paragraph is 
quite difficult and not likely to be successful, but we offer the following narrative 


2 J. J. O'Connor and E. F. Robertson, “Pavel Samuilovich Urysohn,” in MacTutor History
of Mathematics (St Andrews: University of St Andrews, 1998),
http://mathshistory.st-andrews.ac.uk/Biographies/Urysohn/, accessed Oct. 31, 2020.
for the satisfaction of the reader who insists on an overview of the subject. We
saw in chapter 4 that the collection of open sets generated by a metric has many 
intrinsic properties independent of the defining metric. In this section, we study 
the arrangement of the collection of open sets, or the topology, in a metric-free 
context. Every metric space is a topological space; hence all results for topological 
spaces (which are meaningful in the metric setting) are also valid for metric spaces, 
but not conversely. We often fall back on the metric case to gain insight into both 
subjects. We will encounter in this section many of the definitions that appeared 
in chapter 4, such as closure, interior, and boundary. We include those definitions 
again in this chapter for ease of reference. However, the proofs that duplicate 
those in chapter 4 are omitted. The amount of duplication is small and does 
not rise to the level of redundancy. We encourage the reader to compare results in 
this section to their counterparts in the previous chapter. The exercise is insightful. 


Let X be a nonempty set, and let J be a collection of subsets of X; J is called a 
topology on X if 


(a) $\emptyset$ and X are in J,
(b) the union of an arbitrary family of members of J is a member of J, and 
(c) the intersection of two members of J is a member of J. 


Thus J is closed under the formation of arbitrary unions and finite intersections. 
The members of J are called the open subsets of X, and the pair (X, J) is called a
topological space. 
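On a finite set, the three axioms can be checked mechanically. The sketch below is an illustration, not part of the text's development; the function name `is_topology` is hypothetical, and the check relies on the fact that, for a finite family, closure under pairwise unions implies closure under arbitrary unions.

```python
from itertools import combinations


def is_topology(X, T):
    """Check the three axioms for a family T of subsets of a finite set X."""
    X = frozenset(X)
    T = {frozenset(U) for U in T}
    # Axiom (a): the empty set and X belong to T.
    if frozenset() not in T or X not in T:
        return False
    # Axioms (b) and (c): for a finite family it suffices to test pairs.
    return all(U | V in T and U & V in T for U, V in combinations(T, 2))
```

For example, `is_topology({1, 2, 3}, [set(), {1}, {1, 2}, {1, 2, 3}])` holds, while the family consisting of the empty set, {1, 2}, {2, 3}, and {1, 2, 3} fails because {1, 2} ∩ {2, 3} = {2} is missing.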


Example 1. Let X be a nonempty set, and let J = P(X). Clearly, (X, J) is a
topological space. In fact, P(X) is the largest topology one can define on X. In
this topology, every subset of X is open. This topology is known as the discrete
topology on X. It is clear that the discrete topology is too large to be useful. ♦

Example 2. Let X be a nonempty set, and let $J = \{\emptyset, X\}$. This topology is called
the trivial or indiscrete topology on X. ♦

Example 3. Let X be an infinite set, and define a subset U of X to be open if $U = \emptyset$
or if $X - U$ is finite. We verify that the collection of open sets we just defined is
a topology. If $\{U_\alpha\}_\alpha$ is a collection of nonempty open sets, then, for each α, $F_\alpha = X - U_\alpha$
is finite. Now $\bigcup_\alpha U_\alpha$ is open because $X - \bigcup_\alpha U_\alpha = \bigcap_\alpha F_\alpha$, which is finite. It is
easy to verify that the intersection of two open subsets is open. This topology is
called the co-finite topology on X (or the finite complement topology). ♦

Example 4. Let $X = (0, \infty)$, and let J consist of $\emptyset$ and all intervals of the form
$(a, \infty)$, for all $a \ge 0$. It is easy to verify that J is a topology. ♦
Example 5. The most common topologies are the metric topologies. Thus every
metric space is a topological space in accordance with the following definition: a
subset U of a metric space (X, d) is open if it is the union of open balls. Theorem
4.1.2 says precisely that the collection of open sets thus defined is a topology.
The reader can look up sections 3.6, 3.7, 4.1, and 4.8 for a variety of examples of
metric spaces and hence topological spaces. ♦

Example 6. The most important topological space is $\mathbb{R}^n$, where the topology is the
metric topology generated by the Euclidean metric (or any equivalent metric).
We will call this the usual topology on $\mathbb{R}^n$. ♦


If the topology J on a topological space X is understood, we simply say that X is 
a topological space and omit the reference to J. If more than one topology on X 
is being considered or if there is a danger of ambiguity, we will specifically state 
which topology applies to the situation in hand. 


Definition. Let (X, J) be a topological space. A subset F of X is said to be closed
if its complement is open.

Theorem 5.1.1. Let X be a topological space. Then

(a) X and $\emptyset$ are closed,
(b) the union of finitely many closed sets is closed, and
(c) the intersection of an arbitrary collection of closed sets is closed.

Definition. Let A be a subset of a topological space X. The interior of A, denoted
int(A), is the union of all the open sets contained in A. A point of int(A) is called
an interior point of A. The closure of A, denoted $\overline{A}$, is the intersection of all the
closed sets containing A. A point of $\overline{A}$ is called a closure point of A.


The following properties of interiors and closures are straightforward. See the 
corresponding results in chapter 4. 


Theorem 5.1.2. Let A and B be subsets of a topological space X. Then

(a) int(A) is the largest open subset of A;
(b) A is open if and only if int(A) = A;
(c) if $A \subseteq B$, then $\operatorname{int}(A) \subseteq \operatorname{int}(B)$;
(d) $\overline{A}$ is the smallest closed subset of X containing A;
(e) A is closed if and only if $\overline{A} = A$; and
(f) if $A \subseteq B$, then $\overline{A} \subseteq \overline{B}$.
Definition. Let X be a topological space, and let $x \in X$. A neighborhood of x is a
subset A of X that contains an open set that contains x. A neighborhood of a set
$E \subseteq X$ is a subset A of X that contains an open set that contains E.

Observe that the definition does not require a neighborhood of a point or a set
to be open. If A is open, we specifically refer to it as an open neighborhood of
x (respectively, E). For example, an open set is a neighborhood of each of its points.


The following theorem provides a useful criterion for characterizing the closure of 
a set. Compare its statement and proof to those of theorem 4.2.4. 


Theorem 5.1.3. Let A be a subset of a topological space X. Then $x \in \overline{A}$ if and only if
every open neighborhood of x intersects A.

Proof. Suppose $x \notin \overline{A}$. Then x is in the open set $U = X - \overline{A}$, and, clearly, $U \cap A = \emptyset$.
Conversely, if U is an open neighborhood of x that does not intersect A, then A is
contained in the closed set $F = X - U$. Therefore $\overline{A} \subseteq F$. In particular, $x \notin \overline{A}$. ∎


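Theorem 5.1.4. Let A be a subset of a topological space (X, J). Then

(a) $\operatorname{int}(A) = X - \overline{X - A}$, and
(b) $\overline{A} = X - \operatorname{int}(X - A)$.

Proof. The proof is as follows:

$$\operatorname{int}(A) = \bigcup\{U : U \in J, U \subseteq A\} = X - \Big[X - \bigcup\{U : U \in J, U \subseteq A\}\Big]$$
$$= X - \bigcap\{X - U : U \in J, U \subseteq A\} = X - \bigcap\{F : (X - A) \subseteq F, F \text{ closed}\}$$
$$= X - \overline{X - A}.$$

To prove part (b), let $B = X - A$. Then, by part (a),
$X - \operatorname{int}(B) = X - [X - \overline{X - B}] = \overline{X - B} = \overline{A}$. ∎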
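The two identities of theorem 5.1.4 can be tested on a small finite space. The following sketch is an illustration under the assumption that a topology is stored as a collection of `frozenset` objects; the names `interior` and `closure` and the sample space are hypothetical.

```python
def interior(A, T):
    """Union of all the open sets contained in A."""
    result = frozenset()
    for U in T:
        if U <= A:
            result |= U
    return result


def closure(A, X, T):
    """Intersection of all the closed sets containing A."""
    result = X
    for U in T:
        F = X - U            # F is closed because U is open
        if A <= F:
            result &= F
    return result


X = frozenset({1, 2, 3, 4})
T = [frozenset(s) for s in (set(), {1}, {1, 2}, {1, 2, 3}, {1, 2, 3, 4})]
A = frozenset({2, 3})
```

With this data, `interior(A, T)` is empty and `closure(A, X, T)` is {2, 3, 4}, in agreement with $X - \overline{X - A}$ and $X - \operatorname{int}(X - A)$, respectively.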


Definition. Let A be a subset of a topological space X. The boundary of A, denoted
$\partial A$, is the set $\overline{A} \cap \overline{X - A}$. A point of $\partial A$ is called a boundary point of A.


By theorem 5.1.3, a point x is a boundary point of A if and only if every open 
neighborhood of x intersects A and its complement. 


The proofs of the following statements strongly resemble their metric counterparts.
See, for example, theorems 4.2.8 and 4.2.9.

Theorem 5.1.5. Let A be a subset of a topological space X. Then

(a) $\operatorname{int}(A) \cap \partial A = \emptyset$,
(b) $\overline{A} = \operatorname{int}(A) \cup \partial A$,
(c) $\overline{A} = A \cup \partial A$, and
(d) A is closed if and only if $\partial A \subseteq A$.


Subspace Topology 


Let (X, J) be a topological space, and let Y be a subset of X. Define $J_Y = \{Y \cap U :
U \in J\}$. It is easy to verify that $J_Y$ is a topology on Y. For example, if $\{Y \cap U_\alpha\}_\alpha$
is a collection of members of $J_Y$, then $\bigcup_\alpha (Y \cap U_\alpha) = Y \cap (\bigcup_\alpha U_\alpha)$, which is in
$J_Y$ because $\bigcup_\alpha U_\alpha \in J$. Verifying that $J_Y$ is closed under the formation of finite
intersections is straightforward.

Definition. The topology $J_Y$ is known as the relative, subspace, or restricted
topology on Y induced by the topology J.

Theorem 5.1.6. Let $A \subseteq Y$, and let $\overline{A}^Y$ denote the closure of A in $(Y, J_Y)$. Then

$$\overline{A}^Y = \overline{A} \cap Y.$$

Proof. Since $\overline{A}$ is closed in X, $\overline{A} \cap Y$ is closed in Y. Since $A \subseteq \overline{A} \cap Y$, $\overline{A}^Y \subseteq \overline{A} \cap Y$. We
prove the reverse containment. Since $\overline{A}^Y$ is closed in Y, there exists a closed subset
F of X such that $\overline{A}^Y = F \cap Y$. Thus F is a closed subset of X, and $A \subseteq F$. Hence
$\overline{A} \subseteq F$ and $\overline{A} \cap Y \subseteq F \cap Y = \overline{A}^Y$. ∎
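The relation between the closure in a subspace and the closure in the ambient space can be illustrated on a finite example. This is a sketch only; the helper names `closure` and `subspace_topology` and the sample sets are hypothetical.

```python
def closure(A, X, T):
    """Intersection of all the closed subsets of (X, T) containing A."""
    result = X
    for U in T:
        F = X - U
        if A <= F:
            result &= F
    return result


def subspace_topology(Y, T):
    """The relative topology {Y ∩ U : U in T}."""
    return {Y & U for U in T}


X = frozenset({1, 2, 3, 4})
T = {frozenset(s) for s in (set(), {1}, {1, 2}, {1, 2, 3}, {1, 2, 3, 4})}
Y = frozenset({2, 3, 4})
T_Y = subspace_topology(Y, T)
A = frozenset({3})
closure_in_Y = closure(A, Y, T_Y)   # closure computed inside the subspace
closure_in_X = closure(A, X, T)     # closure computed in the ambient space
```

Here both `closure_in_Y` and `closure_in_X & Y` equal {3, 4}.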


Exercises 


1. Let $J = \{[n, \infty) : n \in \mathbb{Z}\}$. Is J a topology on $\mathbb{R}$?

2. Let X be an infinite set, and let J be the collection of subsets U of X such
that $U = \emptyset$ or $X - U$ is countable. Is J a topology?

3. Define J to be the following collection of subsets of $\mathbb{R}$: $U \in J$ if U is empty
or if $0 \in U$. Verify that J is a topology. Prove that $\overline{\{0\}} = \mathbb{R}$ and that the
restriction of J to $\mathbb{R} - \{0\}$ is the discrete topology.

4. Let (X, J) be a topological space, and let w be an object not in X. Define a
collection F of subsets of $Y = X \cup \{w\}$ as follows: a subset A of Y is in F if
$A = \emptyset$ or if $A = \{w\} \cup U$, where $U \in J$. Is F a topology on Y?

5. Prove that the intersection of an arbitrary collection of topologies on a set
X is a topology on X.

6. (a) Prove that if A and B are subsets of a topological space X, then $\overline{A \cup B} =
\overline{A} \cup \overline{B}$.
(b) Let $\{A_\alpha\}_\alpha$ be an arbitrary collection of subsets of X. Prove that $\bigcup_\alpha \overline{A_\alpha} \subseteq
\overline{\bigcup_\alpha A_\alpha}$ and give an example to show that strict inclusion is possible.

7. Let Y be a subspace of a topological space X. Show that a subset A of Y
is closed in Y if and only if there exists a closed subset F of X such that
$A = F \cap Y$.

8. Let U be an open subset of X, and let $A \subseteq X$. Show that if $U \cap \overline{A} \ne \emptyset$, then
$U \cap A \ne \emptyset$.


Definition. A point x is said to be a limit point of a subset A of a 
topological space X if every open neighborhood of x intersects A at a point 
other than x. The set of limit points of A is denoted by A’. 


9. (a) Prove that x is a limit point of A if and only if $x \in \overline{A - \{x\}}$.
(b) Prove that theorem 4.2.7 is valid for a general topological space.
10. Let A and B be subsets of a topological space X. Which of the following is
true?

(a) $(A \cup B)' = A' \cup B'$
(b) $(A \cap B)' = A' \cap B'$

Definition. A subset A of a topological space X is said to be nowhere dense
in X if $\operatorname{int}(\overline{A}) = \emptyset$.


11. Show that the results of problems 9 and 10 on section 4.6 are valid for a 
general topological space. 


5.2 Bases and Subbases 


Some topologies are quite difficult to define directly, and it is frequently the case 
that we want to define a topology on a set X that includes a certain collection 
S of subsets of X. The existence of such a topology is obvious because P(X) is 
such a topology. However, P(X) is useless because it is too large. This immediately 
suggests the question of finding the smallest topology J on X that contains S.
Fortunately, such a unique smallest topology J exists. 

The reader may wonder what situations would compel us to “want” the mem- 
bers of S to be open. The prime such situation is when we need a certain class 
of functions from X to another topological space Y to be continuous, which is 
the overarching idea behind the definition of product and weak topologies. See 
sections 5.4 and 6.7. 
The set S in the above discussion is called a subbase for J, and a closely connected
concept is that of a base for the topology J, which is our first definition. Bases and 
subbases have a wide range of applications. In addition to providing the means 
to define useful topologies, bases and subbases give us easy ways to prove the 
continuity of functions and to characterize closures. See theorems 5.2.2 and 5.3.1. 


Definition. An open base for a topology J on a set X is a collection $\mathcal{B}$ of open
subsets of X such that every nonempty open subset in X is the union of members
of $\mathcal{B}$. If $\mathcal{B}$ is an open base for J, we say that $\mathcal{B}$ generates J.


See problem 2 at the end of this section for an equivalent, more explicit 
formulation of the definition of an open base. 


Example 1. The collection $\mathcal{B} = \{(r,s) : r,s \in \mathbb{Q}, r < s\}$ is an open base for the usual
topology on $\mathbb{R}$. This is because every open subset of $\mathbb{R}$ is the union of open
bounded intervals, and any such interval is the union of members of $\mathcal{B}$: $(a, b) =
\bigcup\{(r,s) : r \in \mathbb{Q}, s \in \mathbb{Q}, a < r < s < b\}$. See section 4.5 for a more general version
of this example. ♦


The collection of open balls in a metric space is an open base for the metric 
topology. This follows immediately from the definition of open sets in a metric 
space. 


Caution: Not every collection $\mathcal{C}$ of subsets of X such that $\bigcup\{U : U \in \mathcal{C}\} = X$ is the
open base for some topology on X, as the next example illustrates.

Example 2. Let X = {a, b, c}, and let $\mathcal{C} = \{\emptyset, X, \{a, b\}, \{b, c\}\}$. The collection $\mathcal{C}$ is
not the base for any topology on X. If it were, that topology would be $\mathcal{C}$ itself,
because the union of two members of $\mathcal{C}$ is in $\mathcal{C}$. However, $\mathcal{C}$ is not a topology,
because $\{a, b\} \cap \{b, c\} \notin \mathcal{C}$. ♦


Theorem 5.2.1. Let X be a nonempty set, and let $\mathcal{B}$ be a collection of subsets of X
such that $\bigcup\{U : U \in \mathcal{B}\} = X$. Then $\mathcal{B}$ is an open base for some topology on X
if and only if, for every $U, V \in \mathcal{B}$, and every $x \in U \cap V$, there exists a member
$W \in \mathcal{B}$ such that $x \in W \subseteq U \cap V$.

Proof. If $\mathcal{B}$ is an open base for some topology J, and x, U, and V are as in the
statement of the theorem, then $U \cap V$ is a nonempty open set. By the definition
of an open base, there is a member W of $\mathcal{B}$ such that $x \in W \subseteq U \cap V$.

Conversely, suppose $\mathcal{B}$ satisfies the assumptions of the theorem. Define a family
of subsets J of X as follows: $U \in J$ if and only if U is the union of members of
$\mathcal{B}$. We claim that J is a topology. Suppose $\{U_\alpha\}_\alpha$ is a collection of members of
J, and let $x \in U = \bigcup_\alpha U_\alpha$. Then $x \in U_\alpha$ for some α. By the very definition of J,
there is a member B of $\mathcal{B}$ such that $x \in B \subseteq U_\alpha \subseteq U$. This makes U the union of
members of $\mathcal{B}$, that is, $U \in J$. Now consider two members $U_1$ and $U_2$ of J, and
let $x \in U_1 \cap U_2$. By the definition of J, there are members $B_1$ and $B_2$ of $\mathcal{B}$ such
that $x \in B_1 \subseteq U_1$ and $x \in B_2 \subseteq U_2$. By assumption, there is a member W of $\mathcal{B}$
such that $x \in W \subseteq B_1 \cap B_2$. Thus $x \in W \subseteq U_1 \cap U_2$ and $U_1 \cap U_2 \in J$. We have
proved that J is a topology. It is clear that $\mathcal{B}$ is an open base for J. ∎
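For a finite set, both the criterion of theorem 5.2.1 and the construction used in its proof (open sets are the unions of base members) can be carried out by brute force. The sketch below is an illustration under that finiteness assumption; the names `is_base` and `generated_topology` are hypothetical, and the enumeration of subfamilies is exponential, so it is meant for very small examples only.

```python
from itertools import combinations


def is_base(X, B):
    """Test the criterion of theorem 5.2.1 for a family B of subsets of a finite set X."""
    B = [frozenset(U) for U in B]
    if frozenset().union(*B) != frozenset(X):
        return False
    # Every point of U ∩ V must lie in some member W of B with W ⊆ U ∩ V.
    return all(any(x in W and W <= U & V for W in B)
               for U in B for V in B for x in U & V)


def generated_topology(B):
    """All unions of subfamilies of B; the empty subfamily yields the empty set."""
    B = [frozenset(U) for U in B]
    return {frozenset().union(*sub)
            for r in range(len(B) + 1) for sub in combinations(B, r)}
```

Applied to X = {a, b, c}, the family of example 2 fails the test because no member fits inside {a, b} ∩ {b, c} = {b}, while the family {a}, {b}, {b, c} passes and generates a topology with six members.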


Example 3. Let $X = \mathbb{R}$, and let $\mathcal{B}$ be the collection of intervals in $\mathbb{R}$ of the form
$[a, b)$, where $a, b \in \mathbb{R}$ and $a < b$. The nonempty intersection of two members
of $\mathcal{B}$ is a member of $\mathcal{B}$. By theorem 5.2.1, $\mathcal{B}$ is the base for a topology on $\mathbb{R}$
called the lower limit topology. The real line with the lower limit topology is
sometimes referred to as the Sorgenfrey line and is denoted by $\mathbb{R}_l$. The lower
limit topology is a rich and complicated topology that provides a number of
interesting counterexamples. See problem 3 at the end of this section, and the
exercises on section 5.7. ♦


The following theorem serves as an early indicator of the importance and typical 
uses of open bases. 


Theorem 5.2.2. Let X be a topological space, and let $\mathcal{B}$ be an open base for the
topology on X. If A is a subset of X, and $x \in X$, then $x \in \overline{A}$ if and only if every
basis element containing x intersects A.


Proof. Use theorem 5.1.3 and problem 2 at the end of this section. 


Definition. A topology J on a set X is said to be weaker (or smaller, or coarser)
than a topology F on X if $J \subseteq F$. We also say that F is stronger (or finer)
than J.


Example 4. If X is an infinite set, then the indiscrete topology on X is weaker 
than the co-finite topology, which, in turn, is weaker than the topology P(X). 
The lower limit topology is strictly stronger than the usual topology on R. See 
problem 3 at the end of this section. 


Definition. An open subbase for the topology J on X is a collection $\mathcal{S}$ of open
sets such that the collection of finite intersections of members of $\mathcal{S}$ is an open
base for J. If $\mathcal{S}$ is a subbase for J, we say that $\mathcal{S}$ generates J.

Example 5. The collection of intervals $\{(-\infty, b) : b \in \mathbb{Q}\} \cup \{(a, \infty) : a \in \mathbb{Q}\}$ is an
open subbase for the usual topology on $\mathbb{R}$. ♦

The following theorem provides an important mechanism for constructing a
topology that contains a predetermined collection of subsets, as described in the 
preamble to the section. 


Theorem 5.2.3. Let $\mathcal{S}$ be a collection of subsets of a nonempty set X such that $\bigcup\{S :
S \in \mathcal{S}\} = X$. Then there exists a unique smallest topology on X that contains $\mathcal{S}$ as
a subbase.

Proof. Let $\mathcal{B}$ be the collection of finite intersections of members of $\mathcal{S}$. If U and V are
in $\mathcal{B}$, then clearly $U \cap V$ is in $\mathcal{B}$. By theorem 5.2.1, $\mathcal{B}$ is the base of a topology, J.
Notice that the members of J are unions of finite intersections of members of $\mathcal{S}$. If
F is another topology that contains $\mathcal{S}$, then F, being a topology, contains all finite
intersections of members of $\mathcal{S}$ and hence all unions of such intersections. Thus F contains J.
This makes J the weakest topology that contains $\mathcal{S}$. ∎
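The two-step construction in the proof (finite intersections first, then unions) can be sketched for a finite set as follows. This is an illustration only; the function name `topology_from_subbase` is hypothetical, and the brute-force enumeration is practical only for very small families.

```python
from itertools import combinations


def topology_from_subbase(S):
    """Unions of finite intersections of members of S, for a finite family S."""
    S = [frozenset(U) for U in S]
    # Step 1: the finite intersections of members of S form a base.
    base = set()
    for r in range(1, len(S) + 1):
        for sub in combinations(S, r):
            base.add(frozenset.intersection(*sub))
    base = list(base)
    # Step 2: the open sets are the unions of subfamilies of the base.
    return {frozenset().union(*sub)
            for r in range(len(base) + 1) for sub in combinations(base, r)}
```

For instance, on X = {a, b, c} the subbase consisting of {a, b} and {b, c} generates the topology whose members are the empty set, {b}, {a, b}, {b, c}, and X.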


Exercises 


1. Show that the collection of open boxes $\{\prod_{i=1}^{n} (a_i, b_i) : a_i, b_i \in \mathbb{Q}\}$ is an open
base for the usual topology on $\mathbb{R}^n$.

2. Prove that a collection $\mathcal{B}$ of open subsets of a topological space X is an
open base if and only if, for every open set U and every $x \in U$, there exists a
member B of $\mathcal{B}$ such that $x \in B \subseteq U$.

3. (a) Prove that the usual topology on $\mathbb{R}$ is weaker than the lower limit
topology.
(b) Prove that each of the following intervals is both open and closed in
the lower limit topology: $[a, b)$, $(-\infty, a)$, and $[a, \infty)$. Conclude that the
usual topology is strictly weaker than the lower limit topology.

4. Let $\mathcal{B}_1$ and $\mathcal{B}_2$ be bases for the topologies $J_1$ and $J_2$ on the same set X. Show
that if, for every $B \in \mathcal{B}_1$ and every $x \in B$, there exists an element $B' \in \mathcal{B}_2$
such that $x \in B' \subseteq B$, then $J_1 \subseteq J_2$.

5. Let $\mathcal{B}$ be an open base for a topology J on a set X. Prove that J is the
intersection of all the topologies on X that contain $\mathcal{B}$.

6. What topology on $\mathbb{R}$ is generated by the open subbase $\{(-\infty, a) : a \in \mathbb{R}\}$?

7. Let $\{J_\alpha\}_\alpha$ be a collection of topologies on a set X. Prove that there is a unique
smallest topology J that contains $\bigcup_\alpha J_\alpha$.


5.3 Continuity 


In section 4.3, we studied the definition of local continuity of functions on metric 
spaces. It is clear that the ε-δ definition provides no clues to generalizing the
definition to the topological case. However, theorem 4.3.1 provides a metric-free
characterization of local continuity which, with very slight changes, produces the
following definition.

Definition. Let X and Y be topological spaces. A function $f : X \to Y$ is said to be
continuous at a point $x_0 \in X$ if, for every open subset V of Y containing $f(x_0)$,
$f^{-1}(V)$ contains an open neighborhood of $x_0$.


We point out here an important distinction between metric and general 
topologies. Theorem 4.3.2 established the fact that, in the metric case, continuity 
is equivalent to sequential continuity. This is not the case for a general 
topological space. See problem 11 at the end of this section. 


As in the metric case, we can define a function from a topological space X to 
another space Y to be continuous if it is continuous at each point of X. However, 
theorem 4.3.3 suggests a more convenient, and widely used, definition of global 
continuity. 


Definition. Let $(X, J_X)$ and $(Y, J_Y)$ be topological spaces. A function $f : X \to Y$ is
said to be continuous if the inverse image of every open subset of Y is an open
subset of X. Symbolically, $V \in J_Y$ implies $f^{-1}(V) \in J_X$.

Continuity depends entirely on the topologies on X and Y. Let $X = \mathbb{R}$, let $J_1$ be
the discrete topology on X, and let $J_2$ be the usual topology on $\mathbb{R}$. The identity
function $I_X : (X, J_1) \to (X, J_2)$ is continuous, but the very same function $I_X :
(X, J_2) \to (X, J_1)$ is not continuous because not every subset of $\mathbb{R}$ is open in the
usual topology of $\mathbb{R}$.


Theorem 5.3.1. Using the notation of the above definition, the following are
equivalent:

(a) f is continuous.
(b) The inverse image of a closed subset of Y is a closed subset of X.
(c) If $\mathcal{B}$ is an open base for $J_Y$, then $f^{-1}(B)$ is open in X for every $B \in \mathcal{B}$.
(d) If $\mathcal{S}$ is an open subbase for $J_Y$, then, for every $S \in \mathcal{S}$, $f^{-1}(S)$ is open in X.

Proof. Parts (a) and (b) are equivalent because of the identity $f^{-1}(F) = X - f^{-1}(Y - F)$
and the fact that a subset F of Y is closed if and only if $Y - F$ is open.
Clearly, (a) implies (c), and (c) implies (d). Now (d) implies (c) by virtue of the
identity $f^{-1}(S_1 \cap \dots \cap S_n) = f^{-1}(S_1) \cap \dots \cap f^{-1}(S_n)$, and (c) implies (a) because of
the identity $f^{-1}(\bigcup_\alpha B_\alpha) = \bigcup_\alpha f^{-1}(B_\alpha)$. ∎
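On finite spaces, the equivalence of (a) and (d) can be observed directly: testing inverse images of subbase members gives the same verdict as testing every open set. The sketch below is an illustration; functions are modeled as dictionaries, and the names `preimage` and `is_continuous` and the sample spaces are hypothetical.

```python
def preimage(f, V):
    """Inverse image of V under a function given as a dict from points to values."""
    return frozenset(x for x in f if f[x] in V)


def is_continuous(f, T_X, sets_in_Y):
    """True if the inverse image of every listed subset of Y is open in X."""
    return all(preimage(f, V) in T_X for V in sets_in_Y)


T_X = {frozenset(s) for s in (set(), {1}, {1, 2}, {1, 2, 3})}
T_Y = {frozenset(s) for s in (set(), {"a"}, {"a", "b"})}
subbase_Y = [frozenset({"a"}), frozenset({"a", "b"})]   # generates T_Y

f = {1: "a", 2: "a", 3: "b"}   # inverse image of {"a"} is {1, 2}, which is open
g = {1: "b", 2: "a", 3: "a"}   # inverse image of {"a"} is {2, 3}, which is not open
```

Both `is_continuous(f, T_X, T_Y)` and `is_continuous(f, T_X, subbase_Y)` hold, and both fail for `g`.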
Example 1. Suppose f is a real-valued function on a topological space X. If f is
continuous at $x_0$, then there exists an open neighborhood U of $x_0$ such that $f(U)$
is a bounded subset of $\mathbb{R}$.

Consider the open interval $V = (f(x_0) - 1, f(x_0) + 1)$. By the definition of
continuity, there exists an open neighborhood U of $x_0$ such that $f(U) \subseteq V$.
Equivalently, for every $x \in U$, $|f(x) - f(x_0)| < 1$. Now, for every $x \in U$, $|f(x)| \le
|f(x) - f(x_0)| + |f(x_0)| < 1 + |f(x_0)| < \infty$. ♦


We leave the proof of the following theorem as an exercise. 


Theorem 5.3.2. A function f from a topological space X to a topological space Y is
continuous if and only if it is continuous at each point $x \in X$. ∎

Definition. A real-valued function f on a topological space X is said to be locally
bounded if, for every $x \in X$, there exists an open neighborhood $U_x$ of x such
that $f|_{U_x}$ is bounded in $\mathbb{R}$.


The following example follows directly from example 1 and the previous theorem. 


Example 2. A continuous, real-valued function on a topological space is locally 
bounded. 


Two Important Function Spaces 


In section 4.8, we defined the spaces B(X) of bounded functions on an arbitrary 
nonempty set X and, for a metric space X, the spaces C(X) of continuous 
functions on X and BC(X) of continuous bounded functions on X. The same 
definitions clearly make good sense when X is a topological space. Theorem 4.8.1 
states that B(X) is a complete normed linear space under the uniform metric. The 
following theorem is the generalization of theorem 4.8.2. 


Theorem 5.3.3. If X is a topological space, then the space BC(X) of continuous 
bounded functions on X is a complete normed linear space. 


Proof. Since BC(X) is a subspace of B(X), it suffices to show that BC(X) is closed
in B(X). Let $f \in B(X)$ be a closure point of BC(X). We need to show that f is
continuous at each point $x_0 \in X$. For $\varepsilon > 0$, there exists a function $g \in BC(X)$
such that $\|f - g\|_\infty < \varepsilon/3$. By the continuity of g at $x_0$, there exists an open
neighborhood U of $x_0$ such that, for every $x \in U$, $|g(x) - g(x_0)| < \varepsilon/3$. Now if
$x \in U$, then $|f(x) - f(x_0)| \le |f(x) - g(x)| + |g(x) - g(x_0)| + |g(x_0) - f(x_0)| < \varepsilon$. ∎


Homeomorphisms 


Definition. We say that two topological spaces $(X, J_X)$ and $(Y, J_Y)$ are homeomorphic
if there exists a bijection $\varphi : X \to Y$ such that both $\varphi$ and $\varphi^{-1}$ are
continuous. We call such a function $\varphi$ bicontinuous, or a homeomorphism.


Intuitively speaking, two topological spaces are homeomorphic if they have iden- 
tical arrangements of open sets. 


Example 3. Any two open bounded intervals are homeomorphic. The linear 
function that maps (0, 1) onto (a, b) is clearly bicontinuous. 


Example 4. The stereographic projection is a homeomorphism from the punctured
sphere onto $\mathbb{R}^2$. ♦

Example 5. Not every continuous bijection is a homeomorphism. The function
$f(t) = (\cos t, \sin t)$ is a continuous bijection from the half-open interval $[0, 2\pi)$
onto the unit circle. However, its inverse is not continuous at the point (1, 0). ♦


Definition. Let $(X, J_X)$ and $(Y, J_Y)$ be topological spaces, let $\varphi : X \to Y$ be an
injection, and let $Z = \mathcal{R}(\varphi)$ be the range of $\varphi$. If $\varphi$ and $\varphi^{-1} : Z \to X$ are both continuous, we say
that $\varphi$ injects X homeomorphically into Y. We also say that X is topologically
embedded in Y. Here Z is given the restricted topology induced by $J_Y$.

Example 6. The inverse stereographic projection embeds $\mathbb{R}^2$ into the unit
sphere. ♦

Upper and Lower Semicontinuous Functions

Definition. A real-valued function f on a topological space X is said to be lower
semicontinuous if, for every $a \in \mathbb{R}$, $f^{-1}((a, \infty))$ is open. We say that f is upper
semicontinuous if $f^{-1}((-\infty, b))$ is open for every $b \in \mathbb{R}$.*


Theorem 5.3.4. Let f be a real-valued function on a topological space X.

(a) f is continuous if and only if it is both upper and lower semicontinuous.
(b) The characteristic function of an open subset is lower semicontinuous.
(c) The characteristic function of a closed subset is upper semicontinuous.
(d) If $\{f_\alpha\}_\alpha$ is a family of lower semicontinuous functions, then $\sup_\alpha\{f_\alpha\}$ is lower
semicontinuous.
(e) If $\{f_\alpha\}_\alpha$ is a family of upper semicontinuous functions, then $\inf_\alpha\{f_\alpha\}$ is upper
semicontinuous.

* Lower semicontinuous functions played a significant role in the early development of measure
theory. Upper and lower semicontinuous functions facilitate a succinct proof of Urysohn’s lemma
(theorem 5.11.2).


Proof. (a) If f is both upper and lower semicontinuous, then, for all real numbers
a and b, $f^{-1}(a, \infty)$ and $f^{-1}(-\infty, b)$ are open. Since intervals of the type
$(a, \infty)$ and $(-\infty, b)$ form an open subbase for the usual topology on $\mathbb{R}$, f is
continuous. The converse is trivial.
(b) If $A \subseteq X$ is open, then, for $a \in \mathbb{R}$, $\chi_A^{-1}(a, \infty)$ is open because

$$\chi_A^{-1}(a, \infty) = \begin{cases} \emptyset & \text{if } a \ge 1, \\ A & \text{if } 0 \le a < 1, \\ X & \text{if } a < 0. \end{cases}$$

(c) The proof is similar to that of part (b).

(d) Let $f = \sup_\alpha\{f_\alpha\}$. Since $f^{-1}(a, \infty) = \bigcup_\alpha f_\alpha^{-1}(a, \infty)$, $f^{-1}(a, \infty)$ is open.
(e) The proof is similar to that of part (d). ∎


Exercises 


1. Let X, Y, and Z be topological spaces, and let f: X > Yandg: YZ. 
Prove that 
(a) if fis constant, it is continuous. 

(b) iffand gare continuous, so is gof; 

(c) if A is a subset of X, then the inclusion map from A to X is continuous; 
and 

(d) if fis continuous and A C X, then the restriction of fto A is continuous. 

2. Let X and Y be topological spaces, and let $f : X \to Y$.

(a) Prove that f is continuous if and only if, for every $x \in X$ and every open
neighborhood V of $f(x)$ in Y, there exists an open neighborhood U of x
such that $f(U) \subseteq V$.

(b) Prove that f is continuous if and only if for every subset A of X,
$f(\overline{A}) \subseteq \overline{f(A)}$.

3. Let $\mathcal{T}$ and $\mathcal{F}$ be two topologies on a set X. Prove that $\mathcal{T}$ is weaker than $\mathcal{F}$ if
and only if the identity function $I_X : (X, \mathcal{F}) \to (X, \mathcal{T})$ is continuous. Conclude
that $\mathcal{T} = \mathcal{F}$ if and only if $I_X : (X, \mathcal{F}) \to (X, \mathcal{T})$ is a homeomorphism.

4. Suppose that f is a function from a topological space X to a topological space
Y and that $X = \bigcup_{i=1}^{n} A_i$, where $A_1, \dots, A_n$ are closed subsets of X. Prove that
if each $f|_{A_i}$ is continuous, then f is continuous. This result is also true when
each of the sets $A_i$ is open.

5. Let f and g be continuous real-valued functions on a topological space X.
Prove that
(a) $f + g$, $fg$, and $|f|$ are continuous,
(b) the set $\{x \in X : f(x) \leq g(x)\}$ is closed, and
(c) the functions $h = \min\{f, g\}$ and $k = \max\{f, g\}$ are continuous.

6. Prove that the following subspaces of the Euclidean plane are homeomor- 
phic: 
(a) the punctured plane $\{(x, y) : x^2 + y^2 > 0\}$
(b) the open annulus $\{(x, y) : 1 < x^2 + y^2 < 4\}$

7. Prove that a discrete topological space $(X, \mathcal{T})$ is homeomorphic to a subspace
of $\mathbb{R}$ if and only if X is countable.

8. Let a, b, c, and d be real numbers such that


$$\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} > 0.$$


Show that the function $f(z) = \dfrac{az + b}{cz + d}$ is a homeomorphism of the open upper
half of the complex plane.
9. Let $X = \mathbb{R}^n - \{0\}$. Prove that the function $f(x) = \dfrac{x}{\|x\|_2^2}$ is continuous on X.

10. (a) Let f be a real function on a topological space X. Prove that f is lower
semicontinuous if and only if $-f$ is upper semicontinuous.
(b) Prove that a subset A of X is open if $\chi_A$ is lower semicontinuous.
(c) Prove that a subset B of X is closed if $\chi_B$ is upper semicontinuous.

11. Definition. A sequence $(x_n)$ in a topological space X is said to converge to
$x \in X$ if every neighborhood of x contains all but finitely many terms of
$(x_n)$. See problem 9 on section 4.1.

Let f be a function from a topological space X to a topological space Y.
Show that if f is continuous at $x_0 \in X$, then it is sequentially continuous at
$x_0$ (see theorem 4.3.2). Also give an example to show that the converse is
false.


5.4 The Product Topology: The Finite Case 


In section 4.4, we defined the product of finitely many metric spaces. In this 
section, we develop a construction that generalizes the concept to the case of 
topological spaces. Thus we define a topology on the Cartesian product of a finite 
number of topological spaces. Needless to say, the product topology should agree 
with and extend the definition of the product metric. 

Let $(X_1, \mathcal{T}_1), \dots, (X_n, \mathcal{T}_n)$ be topological spaces, and consider the Cartesian product
$X = \prod_{i=1}^{n} X_i$ of the underlying sets. Consider the following collection of subsets
of X:


$$\mathfrak{S} = \bigcup_{i=1}^{n} \{X_1 \times \dots \times U_i \times \dots \times X_n : U_i \in \mathcal{T}_i\}.$$


Since $\bigcup\{S : S \in \mathfrak{S}\} = X$, theorem 5.2.3 applies; hence the following definition is
meaningful:


Definition. In the notation of the above paragraph, the product topology on X
is the weakest topology that contains $\mathfrak{S}$.


By construction, $\mathfrak{S}$ is a subbase for the product topology (theorem 5.2.3). An
open base for the product topology on X consists of intersections of finitely
many members of $\mathfrak{S}$. Since $\bigcap_{i=1}^{n} (X_1 \times \dots \times U_i \times \dots \times X_n) = U_1 \times \dots \times U_n$, an
open base for the product topology is the collection


$$\mathfrak{B} = \{U_1 \times \dots \times U_n : U_i \in \mathcal{T}_i\}.$$


The set $\mathfrak{S}$ is referred to as the defining subbase for the product topology and
the set $\mathfrak{B}$ is called the defining base for the product topology.


The following theorem establishes a property of the product topology that
actually characterizes that topology and is frequently used as an alternative,
equivalent definition of the product topology. Recall that $\pi_i$ denotes the projection
of the product space X onto the factor space $X_i$: $\pi_i(x_1, \dots, x_n) = x_i$.


Theorem 5.4.1. The product topology is the weakest topology relative to which all
the projections $\pi_i : X \to X_i$ are continuous.


Proof. Let $U_i$ be open in $X_i$. A set of the type $\pi_i^{-1}(U_i) = X_1 \times \dots \times U_i \times \dots \times X_n$ is
a member of the defining subbase for the product topology on X. Thus $\pi_i^{-1}(U_i)$
is open in X, and $\pi_i$ is continuous. Any topology $\mathcal{F}$ relative to which all the
projections are continuous must contain all sets of the form $\pi_i^{-1}(U_i)$ for all
$1 \leq i \leq n$ and all $U_i \in \mathcal{T}_i$. Thus $\mathcal{F}$ contains the defining subbase $\mathfrak{S}$ of the product
topology. By theorem 5.2.3, the product topology is weaker than $\mathcal{F}$.
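
For two finite spaces, the defining base and the continuity of the projections can be computed directly. The sketch below is illustrative only; the two small spaces and the helper names `topology_from_base`, `product_topology`, and `is_continuous` are assumptions made for this example, and the brute-force union over all subfamilies of the base is only practical for very small spaces.

```python
from itertools import combinations, product


def topology_from_base(base):
    """Return all unions of members of `base`, together with the empty set."""
    base = list(base)
    opens = {frozenset()}
    for r in range(1, len(base) + 1):
        for members in combinations(base, r):
            opens.add(frozenset().union(*members))
    return opens


def product_topology(T1, T2):
    """Topology generated by the defining base {U x V : U in T1, V in T2}."""
    base = {frozenset(product(U, V)) for U in T1 for V in T2}
    return topology_from_base(base)


def is_continuous(f, domain, domain_opens, codomain_opens):
    """f is continuous iff the preimage of every open set is open."""
    return all(
        frozenset(x for x in domain if f(x) in V) in domain_opens
        for V in codomain_opens
    )


# Hypothetical factors: the Sierpinski space and a two-point indiscrete space.
X1 = frozenset({0, 1})
T1 = {frozenset(), frozenset({0}), X1}
X2 = frozenset({"a", "b"})
T2 = {frozenset(), X2}

X = frozenset(product(X1, X2))
T = product_topology(T1, T2)

proj1_ok = is_continuous(lambda p: p[0], X, T, T1)
proj2_ok = is_continuous(lambda p: p[1], X, T, T2)

# A strictly weaker topology on X (the indiscrete one) loses continuity of
# the first projection, in line with the theorem.
indiscrete = {frozenset(), X}
proj1_weaker_ok = is_continuous(lambda p: p[0], X, indiscrete, T1)
```

Here the product topology has exactly three open sets, both projections are continuous, and the first projection is no longer continuous with respect to the weaker indiscrete topology.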


Comparing the above result with problems 2 and 5 on section 4.4 should convince 
the reader that the product topology defined in this section is indeed the correct 
generalization of the product metric. 

Theorem 5.4.2. If $F_i$ is closed in $X_i$, then $\prod_{i=1}^{n} F_i$ is closed in $\prod_{i=1}^{n} X_i$.
Proof. The proof is identical to that of theorem 4.4.3.


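Theorem 5.4.3. Let $\mathfrak{B}_i$ be an open base for the topology $\mathcal{T}_i$ on $X_i$. Then

$$\prod_{i=1}^{n} \mathfrak{B}_i = \{V_1 \times \dots \times V_n : V_i \in \mathfrak{B}_i\}$$

is an open base for the product topology on $\prod_{i=1}^{n} X_i$.


Proof. Let W be open in $\prod_{i=1}^{n} X_i$, and let $x = (x_1, \dots, x_n) \in W$; W is the union of
sets of the type $U_1 \times \dots \times U_n$, where $U_i \in \mathcal{T}_i$. Therefore, for a set of that type,
$x \in U_1 \times \dots \times U_n \subseteq W$. For each $x_i$, choose a member $V_i \in \mathfrak{B}_i$ such that $x_i \in V_i \subseteq U_i$.
Clearly, $x \in V_1 \times \dots \times V_n \subseteq U_1 \times \dots \times U_n \subseteq W$.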


Exercises 


1. Let X and Y be topological spaces, and let x be a fixed element of X. Prove that
Y is homeomorphic to $\{x\} \times Y$. The latter set is given the restricted topology
induced by the product topology on $X \times Y$.

2. Prove that $X_1 \times \dots \times X_n$ is homeomorphic to $X_1 \times (X_2 \times \dots \times X_n)$.

3. Let X and Y be topological spaces, and let $A \subseteq X$ and $B \subseteq Y$. Prove that
$\overline{A \times B} = \overline{A} \times \overline{B}$.


Definition. A function f from a topological space X to a topological space Y
is said to be an open mapping if $f(U)$ is open in Y for every open subset U of
X. Similarly, f is a closed mapping if it maps closed subsets of X into closed
subsets of Y.


4. Prove that the projections $\pi_i$ from a product space $\prod_{i=1}^{n} X_i$ onto the factor
spaces $X_i$ are open. Also give an example to show that the projections need
not be closed mappings.

5. Let $X_1$, $X_2$, $Y_1$, and $Y_2$ be topological spaces, and let $f_i : X_i \to Y_i$ be continuous,
$i = 1, 2$. Prove that the function $F : X_1 \times X_2 \to Y_1 \times Y_2$ defined by
$F(x_1, x_2) = (f_1(x_1), f_2(x_2))$ is continuous.

6. Let X be a nonempty set (no topology), and let $\{Y_\alpha\}$ be a collection of
topological spaces. Show that, for an arbitrary collection of functions
$f_\alpha : X \to Y_\alpha$, there is a unique smallest topology on X relative to which all the
functions $f_\alpha$ are continuous.

7. Prove that if $A_i$ is dense in $X_i$ for $1 \leq i \leq n$, then $\prod_{i=1}^{n} A_i$ is dense in the
product topology.

8. Let X be an infinite set, and let $\mathcal{T}$ be the co-finite topology on X. Prove that
the product topology on $X \times X$ is not the co-finite topology.


5.5 Connected Spaces 


Intuitively speaking, a disconnected space comes in two pieces. One might be
tempted to define a disconnected space as the union $(X_1 \cup X_2, \mathcal{T}_1 \cup \mathcal{T}_2)$ of two
topological spaces $(X_1, \mathcal{T}_1)$ and $(X_2, \mathcal{T}_2)$ where $X_1 \cap X_2 = \emptyset$. A little reflection
reveals that $\mathcal{T}_1 \cup \mathcal{T}_2$ is not a topology. A topology $\mathcal{T}$ on $X = X_1 \cup X_2$ that contains
$\mathcal{T}_1 \cup \mathcal{T}_2$ must contain $U_1 \cup U_2$ for any two open sets $U_1 \in \mathcal{T}_1$ and $U_2 \in \mathcal{T}_2$. In
particular, $X_1 \in \mathcal{T}$ and $X_2 \in \mathcal{T}$. Thus X is the union of two open, disjoint proper
subsets of X. This leads us to the following definition.


Definition. A topological space is said to be connected if it is not the union of two 
disjoint nonempty open subsets. If X is not connected, we say it is disconnected. 
Thus X is disconnected if $X = P \cup Q$, where P and Q are open, disjoint, and
$P \neq \emptyset \neq Q$. The pair $(P, Q)$ is called a disconnection of X. It is clear that X is
disconnected if and only if it contains a proper, nonempty subset that is both 
open and closed. 


Example 1. The space X= {0,1} with the discrete topology is disconnected 
because it is the union of the open sets {0} and {1}. We will refer to this space as 
the discrete space {0, 1}. 


Theorem 5.5.1. A topological space X is disconnected if and only if there exists a 
continuous function from X onto the discrete space {0, 1}. 


Proof. Let X be disconnected, and let $(P, Q)$ be a disconnection of X. The function
$\varphi : X \to \{0, 1\}$ defined by $\varphi(P) = 0$ and $\varphi(Q) = 1$ is clearly continuous.
Conversely, if $\varphi : X \to \{0, 1\}$ is a continuous surjection, then $P = \varphi^{-1}(0)$ and
$Q = \varphi^{-1}(1)$ form a disconnection of X.
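
On a finite space the equivalence can be tested by brute force: search the open sets for one whose complement is also open. The sketch below is a minimal illustration; the function names `find_disconnection` and `two_valued_map` and the two example topologies are hypothetical and chosen only for this example.

```python
def find_disconnection(points, open_sets):
    """Return a pair (P, Q) of disjoint nonempty open sets whose union is
    the whole space, or None if the space is connected."""
    points = frozenset(points)
    for P in open_sets:
        Q = points - P
        if P and Q and Q in open_sets:
            return P, Q
    return None


def two_valued_map(P):
    """The map sending P to 0 and its complement to 1, as in the proof."""
    return lambda x: 0 if x in P else 1


# The discrete space {0, 1} and the Sierpinski space on the same two points.
discrete = {frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})}
sierpinski = {frozenset(), frozenset({0}), frozenset({0, 1})}

pair = find_disconnection({0, 1}, discrete)      # a disconnection exists
phi = two_valued_map(pair[0])                    # continuous and onto {0, 1}
no_pair = find_disconnection({0, 1}, sierpinski) # None: the space is connected
```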


Definition. A subset X of $\mathbb{R}$ is an interval if whenever $x, y \in X$ and $x < z < y$, then
$z \in X$.


Theorem 5.5.2. A subset X of $\mathbb{R}$ is connected if and only if it is an interval. In
particular, $\mathbb{R}$ is connected.


Proof. Here X is given the relative topology induced by the usual topology on $\mathbb{R}$.
If X is not an interval, then there exist two real numbers x and y in X and a
real number $z \in \mathbb{R} - X$ such that $x < z < y$. The two sets $P = X \cap (-\infty, z)$ and
$Q = X \cap (z, \infty)$ form a disconnection of X.


Now suppose, contrary to our assertion, that X is a disconnected interval, and
let $\varphi : X \to \{0, 1\}$ be a continuous surjection. Since $\varphi$ is onto, there exist two
real numbers $a, b \in X$ such that $\varphi(a) = 0$ and $\varphi(b) = 1$. Without loss of generality,
assume that $a < b$ (otherwise, replace $\varphi$ with $1 - \varphi$). Since X is an
interval, the closed interval $[a, b]$ is contained in X. Define $P = [a, b] \cap \varphi^{-1}(0)$ and
$Q = [a, b] \cap \varphi^{-1}(1)$. Clearly, P and Q are closed and nonempty and partition
$[a, b]$. We claim that there exist sequences $a_n \in P$ and $b_n \in Q$ such that $(a_n)$ is
non-decreasing, $(b_n)$ is non-increasing, and $b_n - a_n = \frac{b - a}{2^{n-1}}$. This immediately leads
to a contradiction because then $\lim_n a_n = \lim_n b_n$, and, since P and Q are closed,
this common limit belongs to $P \cap Q = \emptyset$.

We now construct the sequences $(a_n)$ and $(b_n)$. Define $a_1 = a$, $b_1 = b$. Having
found $a_1, \dots, a_n$ and $b_1, \dots, b_n$, let $m = \frac{a_n + b_n}{2}$. If $m \in P$, define $a_{n+1} = m$ and
$b_{n+1} = b_n$. If $m \in Q$, define $a_{n+1} = a_n$ and $b_{n+1} = m$. The sequences $(a_n)$ and
$(b_n)$ have the stated properties.
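
The bisection construction in the proof can be traced numerically. The sketch below is only an illustration under stated assumptions: the two-valued function `phi` is a hypothetical example that jumps at 1/3 (so it is not continuous, which is exactly what the proof shows must happen for any two-valued surjection on an interval), and the function name `bisection_sequences` is not from the text. Exact rational arithmetic is used so that the widths can be compared exactly.

```python
from fractions import Fraction


def bisection_sequences(phi, a, b, steps):
    """Build the pairs (a_n, b_n) of the proof, assuming phi(a) == 0 and
    phi(b) == 1; phi takes only the values 0 and 1."""
    a_n, b_n = Fraction(a), Fraction(b)
    pairs = [(a_n, b_n)]
    for _ in range(steps):
        m = (a_n + b_n) / 2
        if phi(m) == 0:      # m lies in P: move the left endpoint
            a_n = m
        else:                # m lies in Q: move the right endpoint
            b_n = m
        pairs.append((a_n, b_n))
    return pairs


# Hypothetical two-valued function on [0, 1] with a jump at 1/3.
def phi(t):
    return 0 if t < Fraction(1, 3) else 1


pairs = bisection_sequences(phi, 0, 1, 20)
```

In this run the left endpoints stay in the set where `phi` is 0, the right endpoints stay where it is 1, the widths halve at every step, and both sequences close in on the jump at 1/3, the point where continuity fails.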


Theorem 5.5.3. The continuous image of a connected space is connected. 


Proof. Let X be a connected space, and let f be a continuous surjection of X onto a
topological space Y. If Y is disconnected, there is a continuous surjection $\varphi : Y \to
\{0, 1\}$. In this case, the function $\varphi \circ f$ would be a continuous surjection from X onto
$\{0, 1\}$. This contradicts the connectedness of X and proves that Y is connected.


Example 2. The closed interval $[0, 1]$ is not homeomorphic to the circle $S^1$.


Suppose there exists a homeomorphism $f : [0, 1] \to S^1$. Then the restriction
of f to the connected subset $(0, 1)$ would be a homeomorphism onto its image. But this is a
contradiction because $f((0, 1))$ is the circle with two missing points, which is not
connected.


The following result follows directly from the last two theorems. 


Corollary 5.5.4 (the intermediate value theorem). If X is connected and
$f : X \to \mathbb{R}$ is continuous, then $f(X)$ is an interval. ∎


Example 3. (a) Let $f : [a, b] \to \mathbb{R}$ be a continuous function and, say, $f(a) < f(b)$.
If k is between $f(a)$ and $f(b)$, then there exists a point $x \in (a, b)$ such that
$f(x) = k$.
(b) Let $f : [a, b] \to [a, b]$ be continuous; then f has a fixed point in $[a, b]$.
Since the range, R, of f is connected, it is an interval. In particular, R
contains the interval $[f(a), f(b)]$. This proves (a). We now prove (b).
If $f(a) = a$ or $f(b) = b$, there is nothing to prove, so assume that $f(a) > a$
and $f(b) < b$. Define a function h on $[a, b]$ by $h(x) = x - f(x)$. Then
$h(a) < 0 < h(b)$. By (a), there is a point $x \in [a, b]$ such that $h(x) = 0$, that is,
$f(x) = x$.
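
The argument in part (b) is not constructive, but the sign change of $h(x) = x - f(x)$ suggests a numerical procedure. The following sketch approximates a fixed point by bisection; it is an illustration rather than part of the proof, and the function name `fixed_point_by_bisection`, the tolerance, and the choice $f = \cos$ on $[0, 1]$ are assumptions made for this example.

```python
import math


def fixed_point_by_bisection(f, a, b, tol=1e-12):
    """Approximate a fixed point of a continuous f mapping [a, b] into
    [a, b], by bisecting on the sign of h(x) = x - f(x)."""
    def h(x):
        return x - f(x)

    lo, hi = a, b
    if h(lo) == 0:
        return lo
    if h(hi) == 0:
        return hi
    # From here on h(lo) < 0 < h(hi), as in the example.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# cos maps [0, 1] into [cos 1, 1], a subset of [0, 1].
x_star = fixed_point_by_bisection(math.cos, 0.0, 1.0)
```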


Theorem 5.5.5. If X and Y are connected, then the product X x Y is connected. 


Proof. Let $(x_0, y_0)$ and $(x_1, y_1)$ be arbitrary but fixed elements in $X \times Y$. Suppose
$\varphi : X \times Y \to \{0, 1\}$ is continuous. The function $i : Y \to \{x_0\} \times Y$ given by $i(y) = (x_0, y)$
is continuous; hence $\varphi \circ i$ is continuous and hence constant because Y is connected.
Thus $\varphi(x_0, y_0) = \varphi(x_0, y_1)$. Likewise, the function $x \mapsto \varphi(x, y_1)$ is constant, so
$\varphi(x_0, y_1) = \varphi(x_1, y_1)$. Thus $\varphi(x_1, y_1) = \varphi(x_0, y_0)$, and $\varphi$ is constant. This proves
that $X \times Y$ is connected.


Corollary 5.5.6. $\mathbb{R}^n$ is connected.


Proof. Use induction, the previous theorem, and the fact that $\mathbb{R}^n$ is homeomorphic to
$\mathbb{R} \times \mathbb{R}^{n-1}$. ∎


Definition. A subset A of a topological space $(X, \mathcal{T})$ is connected if it is a
connected space with respect to the restricted topology on A induced by $\mathcal{T}$.


Example 4. The set $X = (-1, 0) \cup (0, 1)$ is a disconnected subspace of $\mathbb{R}$. This is
because $(-\infty, 0) \cap X = (-1, 0)$ and $(0, \infty) \cap X = (0, 1)$; hence both $(-1, 0)$ and
$(0, 1)$ are open in X.


Theorem 5.5.7. Let $\{A_\alpha\}_\alpha$ be a collection of connected subsets of a topological space
X such that $\bigcap_\alpha A_\alpha \neq \emptyset$. Then $A = \bigcup_\alpha A_\alpha$ is connected.


Proof. Let $\varphi : A \to \{0, 1\}$ be continuous. Fix an element $b \in \bigcap_\alpha A_\alpha$. For any $a \in A$,
$a \in A_\alpha$ for some $\alpha$. The restriction of $\varphi$ to $A_\alpha$ is continuous; therefore $\varphi(a) = \varphi(b)$
since $A_\alpha$ is connected. Thus $\varphi$ is not onto, and A is connected.


Theorem 5.5.8. Let A be a connected subset of a topological space X. If B is such that
$A \subseteq B \subseteq \overline{A}$, then B is connected. In particular, $\overline{A}$ is connected.


Proof. Let $\varphi : B \to \{0, 1\}$ be continuous. Since A is connected, $\varphi|_A$ is constant; say,
$\varphi(A) = 0$. Now $\{0\}$ is closed in $\{0, 1\}$, so $\varphi^{-1}(0)$ is closed in B and contains A.
Therefore $\varphi^{-1}(0)$ contains the closure of A in B. But the closure of A in B is
$\overline{A} \cap B = B$. Thus $\varphi(B) = 0$, and $\varphi$ is not onto, showing that B is connected.

Definition. Let X be a topological space, and let $x, y \in X$. We say that the two points x
and y are connected if there is a connected subset of X that contains x and
y. Define a relation $\sim$ on X by $x \sim y$ if x and y are connected. It is clear that $\sim$ is
an equivalence relation.


Theorem 5.5.9. The equivalence classes of the relation $\sim$ in the above definition are
connected sets.


Proof. Let C be one of the equivalence classes and fix an element $a \in C$. For every
$x \in C$, there exists a connected subset $A_x$ of X containing a and x. All the
elements of $A_x$ are related; hence $A_x \subseteq C$. Since $C = \bigcup_{x \in C} A_x$ and $a \in \bigcap_{x \in C} A_x$,
C is connected by theorem 5.5.7.


Definition. The equivalence classes of the relation $\sim$ are called the connected
components of X.


If A is a connected subset of X, then all the elements of A are connected (hence 
related). Therefore A is contained in exactly one of the connected components of 
X. The summary of the above discussion is that the connected components are 
the maximal connected subsets of X. If C is a connected component of X, then $\overline{C}$
is also connected by theorem 5.5.8. Thus $\overline{C}$ is contained in a unique connected
component of X. Since $\overline{C} \cap C \neq \emptyset$, $\overline{C} \subseteq C$. Thus $\overline{C} = C$.

We have proved most of the next result.


Theorem 5.5.10. A topological space X is the disjoint union of a collection $\mathcal{C}$ of
disjoint, connected, closed subsets of X, namely, the connected components of 
X. Every connected subset of X is contained in exactly one of the connected 
components of X. Every proper, nonempty subset of X that is both open and closed 
is the union of connected components of X. 


Proof. The last assertion of the theorem is the only one we still need to prove. Let P
be a proper nonempty subset of X that is both open and closed, and let $Q = X - P$.
Then $\emptyset \neq Q \neq X$, and Q is also open and closed. We show that if C is a connected
component of X, and $C \cap P \neq \emptyset$, then $C \subseteq P$. The sets $C \cap P$ and $C \cap Q$ are both
open and closed in C. Since C is connected, and $C \cap P \neq \emptyset$, $C \cap Q = \emptyset$, because
otherwise the pair $(C \cap P, C \cap Q)$ would form a disconnection of C. This proves
that $C \subseteq P$.


We conclude this section with a brief excursion into path connected spaces. 


Definition. Given two points x and y in a topological space X, a path from x to y
is a continuous function $f : [0, 1] \to X$ such that $f(0) = x$ and $f(1) = y$. If there
is a path from x to y, we say that x and y are path connected. A topological space
X is path connected if every pair of points in X is path connected.


Example 5. Every path connected space is connected.
Let x and y be points in a path connected space X, and let f be a path from
x to y. The set $\{f(t) : t \in [0, 1]\}$ is connected and contains x and y. This shows
that every two points in X are connected, and hence X is connected.


Example 6. For $n > 1$, the space $X = \mathbb{R}^n - \{0\}$ is path connected.
Let x and y be points in X, and consider the line segment, L, that joins x and
y. If L does not contain 0, then L is the path we need. If $0 \in L$, take a point $z \in X$
not on the line through x and y. The union of the two line segments that join x and z,
and then z and y, is a path from x to y. ◆


Example 7. For $n > 1$, the sphere $S^{n-1}$ is path connected.
The function $f(x) = x/\|x\|_2$ maps $\mathbb{R}^n - \{0\}$ continuously onto $S^{n-1}$. The result
now follows from example 6 and problem 12 at the end of this section. ◆


Exercises 


1. Prove that a subset X of $\mathbb{R}$ is an interval (according to the definition in this
section) if and only if X has one of the following types: $(-\infty, \infty)$, $(-\infty, a)$,
$(-\infty, a]$, $(b, \infty)$, $[b, \infty)$, $[a, b)$, $(a, b]$, $[a, b]$, or $(a, b)$. Here a and b are real
numbers, and $a < b$.

2. Prove that the intervals [0, 1) and (0,1) are not homeomorphic. Also show 
that [0,1] and [0, 1) are not homeomorphic. 

3. Show that, for $n > 1$, $\mathbb{R}^n$ is not homeomorphic to $\mathbb{R}$.

4. Prove that a topological space X is connected if and only if every nonempty 
proper subset of X has a nonempty boundary. 

5. Let X be connected. Show that if there exists a continuous, nonconstant 
function $f : X \to \mathbb{R}$, then X is uncountable.

6. Prove that if a subset A of a topological space X is connected, open and 
closed, then A is a connected component of X. 


Definition. A topological space $(X, \mathcal{T})$ is called totally disconnected if the
connected components of X are singletons.


7. Prove that $\mathbb{Q}$ (with the usual topology) is totally disconnected. This result
shows that the connected components of a topological space need not be
open.

8. Prove that a topological space X is totally disconnected if, for every pair of
distinct points x and y, there is a disconnection $(P, Q)$ of X such that $x \in P$
and $y \in Q$.

9. Prove that if a Hausdorff space X has an open base whose members are also
closed, then X is totally disconnected. The definition of a Hausdorff space
appears in the next section.

10. Prove that the Sorgenfrey line is totally disconnected.

11. Prove that the product of two totally disconnected spaces is totally
disconnected.

12. Prove that the continuous image of a path connected space is path
connected.

13. Prove that the set $\{x \in \mathbb{R}^n : \|x\|_2 > 1\}$ is path connected.


Definition. Define a relation $\approx$ on a topological space X by $x \approx y$ if x and
y are path connected.


14. Prove that $\approx$ is an equivalence relation. The equivalence classes of $\approx$ are
called the path connected components of X.

15. It follows from the above exercise that the path connected components 
partition X. Prove that if A is a path connected subset of X, then A is 
contained in exactly one of the path connected components. 

16. Let $A = \{(x, \sin(1/x)) : 0 < x \leq 1/\pi\}$. Clearly, A is path connected and hence
connected. By theorem 5.5.8, the closure $\overline{A}$ of A in $\mathbb{R}^2$ is also connected.
Show that $\overline{A}$ is not path connected. Notice that
$\overline{A} = A \cup \{(0, y) \in \mathbb{R}^2 : -1 \leq y \leq 1\}$.


5.6 Separation by Open Sets 


Metric spaces enjoy strong separation properties, which we often take for granted. 
For example, two distinct points in a metric space have disjoint open neigh- 
borhoods. In chapter 4, we called this property the Hausdorff property. There 
is no reason to expect that the same property should hold true for an arbitrary 
topological space, so this property must be axiomatized. Similarly, theorem 4.2.13 
shows that disjoint closed subsets of a metric space possess disjoint open neigh- 
borhoods. In the general topological setting, this property is known as normality. 
One important problem in topology is that of the metrizability of a topological
space. Explicitly stated, under what set of conditions is a given topology induced
by a metric? The fact that every metric space is normal imposes an immediate
necessary condition on a topology to be metrizable: such a topology must be
normal. Of course, normality is not a sufficient condition for a space to be
metrizable. In section 5.11, we prove a metrization theorem that gives a sufficient
set of conditions for a topology to be metrizable. In this section, we study the three
most common forms of separating points and sets in a topological space.


Definition. A topological space X is said to be a $T_1$ space if, for every pair of
distinct points x and y in X, there exists a neighborhood of x not containing
y and a neighborhood of y not containing x. The two neighborhoods may
intersect.


Definition. A topological space X is said to be Hausdorff (or $T_2$) if for every pair
of distinct points x and y, there is an open neighborhood U of x and an open
neighborhood V of y such that $U \cap V = \emptyset$.


It is safe to say that all important topological spaces are Hausdorff. Weaker
separation axioms, such as $T_1$, are used mostly to generate exercises and
counterexamples.


Theorem 4.1.4 states that a metric space is Hausdorff, which supports the 
statement in the above paragraph since metric spaces are the most important 
(but not the only important) examples of topological spaces. 


Theorem 5.6.1. If X is a Hausdorff space and $x \in X$, then $\{x\}$ is closed.


Proof. We show that the set $W = X - \{x\}$ is open. For every $y \in W$, there exist open
neighborhoods $U_y$ and $V_y$ of x and y, respectively, such that $U_y \cap V_y = \emptyset$. This
clearly implies that $V_y \subseteq W$ for all $y \in W$. Consequently, $W = \bigcup\{V_y : y \in W\}$,
which is open. ∎


Definition. A sequence $(x_n)$ of a topological space X is said to converge to a point
$x \in X$ if every neighborhood of x contains all but finitely many terms of the
sequence.


Theorem 4.1.5 says that the limit of a convergent sequence in a metric space is 
unique. This is precisely because metric spaces are Hausdorff spaces. 


Example 1. Let X be a Hausdorff space, and suppose that $(x_n)$ is a convergent
sequence. Then the limit is unique.

Suppose that $\lim_n x_n = x$ and $\lim_n x_n = y$ and that $x \neq y$. Let U and V be
disjoint open neighborhoods of x and y, respectively. Since $\lim_n x_n = x$, there
is an integer N such that, for all $n > N$, $x_n \in U$. Since $U \cap V = \emptyset$, V can contain
only finitely many terms of $(x_n)$, which is a contradiction. ◆

Example 2. Let X be a topological space, and let Y be a Hausdorff space. If
$f : X \to Y$ is continuous, then the graph of f, $G = \{(x, f(x)) : x \in X\}$, is closed in the
product space $X \times Y$.

We will show that the complement of G is open in $X \times Y$. Let $(x, y) \notin G$; thus
$y \neq f(x)$. Let U and V be disjoint open neighborhoods of y and $f(x)$, respectively.
Because f is continuous, there exists an open neighborhood W of x such that
$f(W) \subseteq V$. It is easy to check that $(W \times U) \cap G = \emptyset$.


Definition. A Hausdorff space X is said to be regular if, for every $x \in X$ and every
closed subset F that does not contain x, there exist open sets U and V such that
$x \in U$, $F \subseteq V$, and $U \cap V = \emptyset$.


Theorem 4.2.12 states that a metric space is regular. 


Example 3. A subspace of a regular space X is regular.

Let Y be a subspace of X, let F be a closed subset of Y (in the restricted
topology on Y), and let $x \in Y - F$. By theorem 5.1.6, $F = \overline{F} \cap Y$, where $\overline{F}$ denotes
the closure of F in X. Now $x \notin \overline{F}$, so by the regularity of X, there exist disjoint
open neighborhoods U and V of x and $\overline{F}$, respectively. The sets $U_1 = U \cap Y$ and
$V_1 = V \cap Y$ are open in Y and separate x and F. ◆


The following characterization of regularity is often useful. 


Theorem 5.6.2. A Hausdorff space X is regular if and only if for every $x \in X$ and every
open neighborhood U of x, there exists an open neighborhood V of x such that
$\overline{V} \subseteq U$.


Proof. Suppose X is regular, and let x and U be as in the statement of the theorem. By
regularity, applied to x and the closed set $X - U$, there exist open neighborhoods
V of x and W of $X - U$ such that $V \cap W = \emptyset$. Because $V \subseteq X - W$ and the latter
set is closed, $\overline{V} \subseteq X - W$. In particular, $\overline{V} \subseteq U$.

Conversely, let F be a closed subset of X that does not contain x. By assumption,
there exists an open neighborhood U of x such that $\overline{U} \subseteq X - F$. Set $V = X - \overline{U}$.
The sets U and V are disjoint open neighborhoods of x and F, respectively,
as desired.


Definition. A Hausdorff space X is said to be normal if, for every pair of disjoint
closed subsets E and F of X, there exist open sets U and V such that $E \subseteq U$,
$F \subseteq V$, and $U \cap V = \emptyset$.


Theorem 4.2.13 states that a metric space is normal. 

The proof of theorem 5.6.3 mimics that of theorem 5.6.2 and is therefore omitted.


Theorem 5.6.3. A Hausdorff space X is normal if and only if for every closed set E
and every open neighborhood U of E, there exists an open neighborhood V of E
such that $\overline{V} \subseteq U$.


Products and subspaces of normal and regular spaces have dissimilar properties.
For example, the product of regular spaces is regular, but the same result does not
hold for the product of normal spaces. Likewise, an arbitrary subspace of a normal
space need not be normal. See the exercises on section 5.7. However, the following
special case is easy to prove.


Example 4. A closed subspace Y of a normal space X is normal.
Let E and F be disjoint closed subsets of Y. Since Y is closed in X, E and F are closed
in X. By the normality of X, there exist disjoint open neighborhoods U and V
of E and F, respectively. The sets $U_1 = U \cap Y$ and $V_1 = V \cap Y$ are open in Y and
separate E and F. ◆


Exercises


1. Prove that a topological space X is $T_1$ if and only if every single-point set of X
is closed.

2. Let X be an infinite set. Prove that the co-finite topology on X is $T_1$ but not
Hausdorff.

3. Let A be a subset of a Hausdorff space X. Show that a point $x \in X$ is a limit
point of A if and only if every neighborhood of x contains infinitely many
points of A.

4. Prove that a subspace of a Hausdorff space is Hausdorff and that the product
of two Hausdorff spaces is Hausdorff.

5. Prove that a topological space X is Hausdorff if and only if the diagonal set
$\{(x, x) : x \in X\}$ is closed in the product space $X \times X$.

6. Let X and Y be topological spaces, and let $f, g : X \to Y$ be continuous. Prove
that if Y is Hausdorff, then the set $\{x \in X : f(x) = g(x)\}$ is closed.

7. Prove that the set of fixed points of a continuous function on a Hausdorff
space is closed.

8. Let f and g be continuous functions from a topological space X to a
Hausdorff space Y. Show that if f and g agree on a dense subset of X, then
$f = g$.

9. Prove that the product of two regular spaces is regular.

10. Prove theorem 5.6.3.

11. Let X be a regular space. Prove that every pair of distinct points in X have
neighborhoods whose closures are disjoint.

12. Let X be a normal space. Prove that every pair of disjoint closed subsets of
X have neighborhoods whose closures are disjoint.


5.7 Second Countable Spaces 


In this section, we study second countable, separable, and Lindelöf spaces. Theorem
4.5.1 states that all three conditions are equivalent for metric spaces. This is
not true for general topological spaces, and several counterexamples are provided
in this section and the section exercises to show the nonequivalence of the three
conditions. However, second countability implies the other two conditions. Second
countability has other pleasant consequences, especially when it is combined
with normality or local compactness. The definitions in this section are identical
to those in the metric case and are included below for ease of reference.


Definition. A subset A of a topological space X is dense in X if $\overline{A} = X$.


Definition. A topological space X is separable if it contains a countable dense 
subset. 


Definition. A topological space X is second countable if the topology on X 
contains a countable open base. 


Definition. A topological space X is said to be a Lindelöf space if every open
cover of X contains a countable subcover of X. The definitions of open covers
and subcovers can be seen in section 4.5.


Example 1. Consider the Sorgenfrey plane, $\mathbb{R}_\ell^2 = \mathbb{R}_\ell \times \mathbb{R}_\ell$. In problem 11, we
ask the reader to show that $\mathbb{R}_\ell^2$ is separable. Here we show that the subspace
$L = \{(x, -x) : x \in \mathbb{R}\}$ is not separable. We claim that the restriction of the topology
on $\mathbb{R}_\ell^2$ to L is the discrete topology. Since L is uncountable, it is not separable. To
prove our claim, let $x \in \mathbb{R}$, and consider the set $U = [x, x + 1) \times [-x, -x + 1)$; U
is open in $\mathbb{R}_\ell^2$, and $U \cap L = \{(x, -x)\}$. Therefore the single point $(x, -x)$ is open
in L. ◆


Example 2. In problem 10, we ask the reader to show that the Sorgenfrey line
$\mathbb{R}_\ell$ is Lindelöf. We show here that $\mathbb{R}_\ell^2$ is not Lindelöf. Thus the product of two
Lindelöf spaces is not necessarily Lindelöf. Let L be as in example 1. The line L is
closed in $\mathbb{R}_\ell^2$. Consider the open cover $\mathcal{U}$ of $\mathbb{R}_\ell^2$ that consists of $\mathbb{R}_\ell^2 - L$ and the
collection $\{[x, x + 1) \times [-x, -x + 1) : x \in \mathbb{R}\}$. Clearly, no countable subset of $\mathcal{U}$
can cover $\mathbb{R}_\ell^2$. ◆

Definition. A collection $\mathfrak{F} = \{F_\alpha : \alpha \in I\}$ of subsets of a nonempty set X is said
to have the countable intersection property if every countable subcollection
of $\mathfrak{F}$ has a nonempty intersection.


Example 3. A topological space X is Lindelöf if and only if every collection of
closed subsets of X with the countable intersection property has a nonempty
intersection.

Suppose X is Lindelöf, and let $\mathfrak{F}$ be a collection of closed subsets with the
countable intersection property. If $\bigcap\{F_\alpha : \alpha \in I\} = \emptyset$, then $X = \bigcup_{\alpha \in I} (X - F_\alpha)$. Therefore
$X = \bigcup_{n=1}^{\infty} (X - F_{\alpha_n})$ for some countable subset $\{\alpha_1, \alpha_2, \dots\}$ of I. It follows that
$\bigcap_{n=1}^{\infty} F_{\alpha_n} = \emptyset$; a contradiction.

Conversely, if $\{U_\alpha\}_{\alpha \in I}$ is an open cover of X with no countable subcover,
then the family $\{F_\alpha : \alpha \in I\} = \{X - U_\alpha : \alpha \in I\}$ has the countable intersection
property because, for a countable subcollection $\{F_{\alpha_1}, F_{\alpha_2}, \dots\}$ of $\mathfrak{F}$,
$\bigcap_{n=1}^{\infty} F_{\alpha_n} = \bigcap_{n=1}^{\infty} (X - U_{\alpha_n}) = X - \bigcup_{n=1}^{\infty} U_{\alpha_n} \neq \emptyset$.
However, $\bigcap_{\alpha \in I} F_\alpha = \bigcap_{\alpha \in I} (X - U_\alpha) = X - \bigcup_{\alpha \in I} U_\alpha = \emptyset$.


Theorem 5.7.1. A subset A of a topological space X is dense if and only if it intersects
every nonempty open subset of X.


Proof. We prove the contrapositive of each implication. If $\overline{A} \neq X$, then the set
$U = X - \overline{A}$ is open, nonempty, and $U \cap A = \emptyset$.
Conversely, if there exists a nonempty open set U such that $U \cap A = \emptyset$, then
$A \subseteq X - U$. Since $X - U$ is closed, $\overline{A} \subseteq X - U \neq X$.


Theorem 5.7.2. In a separable topological space X, every collection of pairwise 
disjoint open sets is countable. 


Proof. We prove that if X contains an uncountable collection of pairwise disjoint nonempty open
subsets, then any dense subset A of X is uncountable. Let $\{U_\alpha\}_{\alpha \in I}$ be an uncountable
family of pairwise disjoint nonempty open subsets of X. By theorem 5.7.1, each $U_\alpha$ intersects
A. Choose an element $a_\alpha \in U_\alpha \cap A$. Now $a_\alpha \neq a_\beta$ if $\alpha \neq \beta$ since $U_\alpha \cap U_\beta = \emptyset$.
Hence A is uncountable.


Theorem 5.7.3. Let X be a second countable topological space. Then 


(a) X is separable, and 
(b) X is Lindeléf. 


Proof. Let $\{B_n\}$ be a countable open base for the topology on X. For each $n \in \mathbb{N}$,
choose a point $a_n \in B_n$, and let $A = \{a_n : n \in \mathbb{N}\}$. If $U \neq \emptyset$ is open in X, then U
contains a basis element $B_n$, and hence $a_n \in A \cap U$. Theorem 5.7.1 implies that A
is dense. The proof of (b) is identical to that in theorem 4.5.1. ∎


It was observed in section 5.6 that normality is a necessary condition for the 
metrizability of a topological space. For second countable spaces, the normality 
requirement can be relaxed, as the following theorem shows. As it turns out, 
regular second countable spaces are metrizable. See the Urysohn metrization 
theorem in section 5.11. 


Theorem 5.7.4. A regular second countable Hausdorff space X is normal. 


Proof. Let E and F be disjoint closed subsets of X, and let 𝔅 be a countable open base
for the topology on X. For every x ∈ E, x belongs to the open set X − F. By theorem
5.6.2, there exists an open neighborhood W of x such that W̄ ⊆ X − F. Choose a
basis element B_x such that x ∈ B_x ⊆ W. Clearly, E ⊆ ∪_{x∈E} B_x. Since 𝔅 is countable,
the collection {B_x}_{x∈E} can be enumerated as {U_n}. Observe that Ū_n ⊆ X − F. A
similar argument produces a countable open cover {V_n} of F such that V_n ∈ 𝔅
and V̄_n ⊆ X − E.

Define U'_n = U_n − ∪_{i=1}^n V̄_i and V'_n = V_n − ∪_{i=1}^n Ū_i. Notice that if n ≤ m, then
U'_n ∩ V'_m = ∅. By symmetry, if m ≤ n, then V'_m ∩ U'_n = ∅. It follows that, for
all m, n ∈ N, U'_n ∩ V'_m = ∅. Now define U = ∪_{n=1}^∞ U'_n and V = ∪_{n=1}^∞ V'_n. Clearly,
U ∩ V = ∅, and it is straightforward to verify that E ⊆ U and F ⊆ V. ∎
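
The verification left to the reader at the end of this proof can be sketched as follows. This is only a worked outline in the notation of the proof, where a prime marks the trimmed sets (each U_n with the closures of V_1, ..., V_n removed, and symmetrically for V_n); it is not part of the original text.

```latex
% Sketch: E \subseteq U and U_n' \cap V_m' = \emptyset in the proof of theorem 5.7.4.
% Assumed notation: U_n' = U_n - \bigcup_{i \le n} \overline{V_i},
%                   V_n' = V_n - \bigcup_{i \le n} \overline{U_i}.
\begin{align*}
x \in E &\implies x \in U_n \text{ for some } n
          && (\{U_n\} \text{ covers } E) \\
        &\implies x \notin \overline{V_i} \text{ for every } i
          && (\overline{V_i} \subseteq X - E) \\
        &\implies x \in U_n' \subseteq U .
\end{align*}
% Disjointness: for n \le m the set V_m' omits \overline{U_n}, which contains U_n'.
\[
n \le m \;\implies\;
U_n' \cap V_m' \subseteq U_n \cap \bigl( V_m - \overline{U_n} \bigr) = \emptyset .
\]
```

The inclusion F ⊆ V follows in the same way with the roles of the two families exchanged.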


Example 4. Let X be a second countable topological space, and let ℭ be an open
base for the topology on X. Then ℭ contains a countable subset which is also
an open base for X.

Let 𝔅 = {B_n : n ∈ N} be a countable open base. Let I be the subset of N × N of
pairs (m,n) for which there is a member C ∈ ℭ such that B_m ⊆ C ⊆ B_n. For each
pair (m,n) ∈ I, choose a member C_{m,n} of ℭ such that B_m ⊆ C_{m,n} ⊆ B_n. We show
that the countable collection {C_{m,n} : (m,n) ∈ I} is an open base. Let U be an
open set, and let x ∈ U. Because 𝔅 is an open base, there is a set B_n such that
x ∈ B_n ⊆ U. For the same reason, there is a member C of ℭ such that x ∈ C ⊆ B_n.
Finally, there is an element B_m of 𝔅 such that x ∈ B_m ⊆ C. Clearly, (m,n) ∈ I,
and x ∈ C_{m,n} ⊆ U. ♦


Exercises 


1. Prove that the product of two second countable spaces is second countable 
and that the product of two separable spaces is separable. 

Definition. A point x of a subset A of a topological space X is said to be
isolated if x has an open neighborhood U such that A ∩ U = {x}.

2. Prove that the set of isolated points of a second countable space X is at
most countable. Then show that if X is uncountable, then X has uncountably
many limit points.


Definition. A topological space X is said to be first countable if, for every
x ∈ X, there is a countable collection {U_n} of open neighborhoods of x such
that, for every open neighborhood U of x, there is an integer n such that
x ∈ U_n ⊆ U. The collection {U_n} is called a local base at x. It is sometimes
convenient to have a local base {V_n} with the additional property that
V_n ⊇ V_{n+1}. This can be easily achieved by defining V_n = ∩_{i=1}^n U_i.


3. Prove that every second countable space is first countable and that every
metric space is first countable.

4. Show that a subspace of a second (respectively, first) countable space is second
(respectively, first) countable. Also show that the product of two first
countable spaces is first countable.

5. Show that a subspace of a separable space need not be separable. Hint: See
problem 3 on section 5.1. For a more elaborate example, see problem 12
below.


6. Let X be an uncountable set, and let 𝒯 be the co-finite topology on X.
Show that every infinite subset of X is dense, and hence X is separable.
Show, however, that X is not second countable. Hint: If {B_n} is a countable
collection of nonempty open subsets of X, then ∩_{n=1}^∞ B_n is uncountable. Pick a
point x ∈ ∩_{n=1}^∞ B_n and consider the open set U = X − {x}.

7. Show that a closed subspace of a Lindelöf space is Lindelöf.

8. Let X be a topological space, and let 𝔅 be an open base for X. Prove that
X is Lindelöf if and only if every open cover of X by members of 𝔅 has a
countable subcover.

9. Show that the Sorgenfrey line is first countable, separable, but not second
countable. Hint: To show that R_l is not second countable, let 𝔅 be an open
base for R_l. For every x ∈ R, there is a member B_x ∈ 𝔅 such that
x ∈ B_x ⊆ [x, x+1).

10. Prove that the Sorgenfrey line R_l is Lindelöf. Together with the previous
problem, this problem shows that not every Lindelöf space is second
countable. Hint: Use problem 8. Let {[a_α, b_α) : α ∈ I} be an open cover of
R_l by basic open subsets of R_l. Define C = ∪_{α∈I} (a_α, b_α). View C as a subset
of R with the usual topology; C is Lindelöf because R is a separable metric
space. Thus there exists a countable subset {α_n : n ∈ N} of I such that
C = ∪_{n=1}^∞ (a_{α_n}, b_{α_n}). Argue that R − C is countable.

11. Show that the Sorgenfrey plane R_l² is separable.

12. Show that the line L in example 2 is closed in R_l².

13. Let X be a topological space, and let 𝔅 be an open base for the topology on
X. Suppose that 𝔅 has infinite cardinality ℵ. Prove that if ℭ is another open
base for the topology on X, then ℭ contains a subset of cardinality ≤ ℵ that
is also an open base for X.


5.8 Compact Spaces 


In section 4.7, we studied compact metric spaces extensively, including several 
equivalent formulations of the definition of compactness. We adopt the same 
definition of compactness in this chapter because the other characterizations do 
not lend themselves easily to generalization to general topological spaces, and 
especially because some of the other characterizations of compact metric spaces 
are false in general. You will also see that compact spaces have pleasant separation 
properties. Finally, we will prove the celebrated Tychonoff theorem for the product 
of finitely many topological spaces. The leading theorems in this section have
counterparts in section 4.7. Therefore, proofs that duplicate those in section 4.7 
will be omitted. 


Definition. A topological space X is said to be compact if every open cover of X 
contains a finite subcover of X. 


Example 1. The co-finite topology on an infinite set X is compact.
Let 𝒰 be an open cover of X, and fix a nonempty element U_1 ∈ 𝒰. The complement
of U_1 is finite, say, U_1 = X − {x_2, ..., x_n}. Now, for each 2 ≤ i ≤ n, pick an
element U_i ∈ 𝒰 that contains x_i. The finite collection {U_1, ..., U_n} covers X. ♦


Example 2. A real-valued, locally bounded function f on a compact space X is
bounded.

By the definition of local boundedness, for every x ∈ X, there exists a positive
number M_x and an open neighborhood U_x of x such that sup_{y∈U_x} |f(y)| ≤ M_x.
Clearly, {U_x : x ∈ X} is an open cover of X. Choose points x_1, ..., x_n ∈ X such
that the sets U_{x_1}, ..., U_{x_n} cover X, and let M = max_{1≤i≤n} M_{x_i}. For x ∈ X,
x ∈ U_{x_i} for some 1 ≤ i ≤ n, and |f(x)| ≤ M_{x_i} ≤ M. ♦


Definition. Let K be a subset of a topological space X. We say that K is a compact
subset (or a compact subspace) of X if it is compact in the restricted topology. 


Theorem 5.8.1. A subset K of a topological space X is compact if and only if it
satisfies the following condition: if 𝒰 is a collection of open subsets of X such that
K ⊆ ∪{U : U ∈ 𝒰}, then there exists a finite subcollection {U_1, U_2, ..., U_n} of 𝒰
such that K ⊆ ∪_{i=1}^n U_i.
The proof is identical to that of theorem 4.7.1. ∎


Theorem 5.8.2. A closed subspace K of a compact space X is compact. 
The proof is identical to that of theorem 4.7.2. 


Example 3. Every compact space has the Bolzano-Weierstrass property.

It is sufficient to prove that if a subset A of a compact space X has no limit
points, then it is finite. By problem 9(b) on section 5.1, A is closed. By theorem
5.8.2, A is compact. No point a ∈ A is a limit point of A; hence there exists
an open subset U_a of X such that A ∩ U_a = {a}. If A were infinite, then the open
cover {U_a : a ∈ A} of A would have no finite subcover. This forces A to be finite,
as claimed. ♦


The converse of the above example is false, but counterexamples are rather difficult. 


Theorem 5.8.3. A compact subspace K of a Hausdorff space X is closed. 
The proof is identical to that of theorem 4.7.3. 


Example 4. Let {K_α}_{α∈I} be a collection of compact subsets of a Hausdorff space X.
If ∩_α K_α = ∅, then the intersection of some finite subcollection of {K_α} is empty.


Let V_α = X − K_α, and fix an element α_1 ∈ I. Each V_α is open because K_α is
closed (theorem 5.8.3). By assumption, {V_α} covers X and hence K_{α_1}. Thus there
exists a finite subset {α_2, ..., α_n} of I such that K_{α_1} ⊆ ∪_{i=2}^n V_{α_i}. It follows
directly that ∩_{1≤i≤n} K_{α_i} = ∅. ♦
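
The last step of this example is an application of De Morgan's law. A short worked version, in the notation of the example (with the finite index set written as α_1, ..., α_n), might read:

```latex
% Sketch: from the finite subcover to the empty finite intersection.
% V_{\alpha} = X - K_{\alpha} as in the example.
\[
K_{\alpha_1} \subseteq \bigcup_{i=2}^{n} V_{\alpha_i}
  = \bigcup_{i=2}^{n} \bigl( X - K_{\alpha_i} \bigr)
  = X - \bigcap_{i=2}^{n} K_{\alpha_i}
\quad\Longrightarrow\quad
\bigcap_{i=1}^{n} K_{\alpha_i}
  = K_{\alpha_1} \cap \bigcap_{i=2}^{n} K_{\alpha_i}
  = \emptyset .
\]
```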


Theorem 5.8.4. The continuous image of a compact space is compact. 
The proof is identical to that of theorem 4.7.4. 


Theorem 5.8.5. A continuous real-valued function f : X → R on a compact space
X is bounded and attains its maximum and minimum values. 
The proof is identical to that of theorem 4.7.12. 


The next result follows immediately from theorem 5.3.3 and the fact that for a 
compact space X, C(X) = BC(X). 


Theorem 5.8.6. Let X be a compact Hausdorff space, and let C(X) be the space
of continuous functions on X. Then (C(X), ‖·‖_∞) is a complete normed linear
space. ∎

Theorem 5.8.7. Let X be a compact space, and let Y be a Hausdorff space. Then a
continuous bijection φ : X → Y is a homeomorphism from X to Y.


Proof. We prove that φ^{-1} is continuous by showing that φ is a closed mapping. Let
F be a closed subset of X. By theorem 5.8.2, F is compact. By theorem 5.8.4, φ(F)
is compact in Y. Now theorem 5.8.3 implies that φ(F) is closed, as desired. ∎


The theorem says that when we limit our attention to compact Hausdorff spaces,
a bijection φ : X → Y is a homeomorphism if and only if it is simply continuous.
In this situation, we can show that X and Y are homeomorphic by merely showing
the continuity of φ or φ^{-1} or by showing that φ (or φ^{-1}) is an open (or a closed)
mapping.


Definition. A collection 𝔉 of subsets of a nonempty set X is said to have the
finite intersection property if every finite subcollection of 𝔉 has a nonempty
intersection.


The next theorem provides a useful equivalent characterization of compactness. 
Its proof is left as an exercise. See example 3 on section 5.7. 


Theorem 5.8.8. The following are equivalent for a topological space X:


(a) X is compact.
(b) If 𝔉 = {F_α : α ∈ I} is a collection of closed subsets of X satisfying the finite
intersection property, then ∩{F_α : α ∈ I} ≠ ∅.


Compactness and Separation 


Theorem 5.8.9. Let X be a Hausdorff space, and let F be a compact subset of X.
For every x ∈ X − F, there exist disjoint open sets U and V such that x ∈ U and
F ⊆ V.


Proof. For every y ∈ F, there exist disjoint open sets U_y and V_y such that x ∈ U_y
and y ∈ V_y. Now F ⊆ ∪_{y∈F} V_y. Since F is compact, F ⊆ ∪_{i=1}^n V_{y_i} for a finite
subset {y_1, ..., y_n} of F. The sets U = ∩_{i=1}^n U_{y_i} and V = ∪_{i=1}^n V_{y_i} have the
desired properties. ∎


Theorem 5.8.10. A compact Hausdorff space X is normal. Thus if E and F are disjoint
closed subsets of X, then there exist disjoint open subsets U and V such that E ⊆ U
and F ⊆ V.

Proof. First observe that E and F are compact by theorem 5.8.2. Let x ∈ E. By the
previous theorem, there are disjoint open sets U_x and V_x such that x ∈ U_x and
F ⊆ V_x. Since E ⊆ ∪_{x∈E} U_x and E is compact, E ⊆ ∪_{i=1}^n U_{x_i} for some finite subset
{x_1, ..., x_n} of E. Set U = ∪_{i=1}^n U_{x_i} and V = ∩_{i=1}^n V_{x_i}. The sets U and V have the
stated properties. ∎


Finite Products of Compact Spaces 


Lemma 5.8.11 (the tube lemma). Let X be a topological space, and let Y be a
compact space. If an open subset W in X × Y contains a line, {x} × Y, then there
exists a neighborhood U of x such that U × Y ⊆ W. Here x is a fixed element of X.


Proof. For every y ∈ Y, there are open sets U_y ⊆ X and V_y ⊆ Y such that (x,y) ∈
U_y × V_y ⊆ W. Thus {x} × Y ⊆ ∪_{y∈Y} (U_y × V_y) ⊆ W. The compactness of {x} × Y
yields a finite subset {y_1, ..., y_n} of Y such that {x} × Y ⊆ ∪_{i=1}^n (U_{y_i} × V_{y_i}) ⊆ W.
Define U = ∩_{i=1}^n U_{y_i}. We claim that U × Y ⊆ W. If u ∈ U and y ∈ Y, then y ∈ V_{y_i}
for some 1 ≤ i ≤ n. But u belongs to U_{y_i} for every 1 ≤ i ≤ n. Therefore
(u,y) ∈ U_{y_i} × V_{y_i} ⊆ W. ∎


The above lemma says that if an open subset of X × Y contains a line, then it
must contain a strip (or a tube, hence the name) that contains the line. Intuitively, 
an open subset of X x Y cannot get arbitrarily thin around a line. The following 
example illustrates the concept. 


Example 5. The open subset W = {(x,y) ∈ R² : x ∈ R, |y| < 1/(1 + x²)} contains the


x-axis but there is no positive number δ such that R × (−δ, δ) is contained
in W. ♦


Theorem 5.8.12 (Tychonoff's theorem). If X and Y are compact spaces, so is
X × Y.


Proof. Let 𝒲 be an open cover of X × Y. For x ∈ X, {x} × Y ⊆ ∪{W : W ∈ 𝒲}.
Since {x} × Y is compact, there exists a finite subset {W_1^x, ..., W_{n_x}^x} of 𝒲 such
that {x} × Y ⊆ ∪_{i=1}^{n_x} W_i^x. Let W^x = ∪_{i=1}^{n_x} W_i^x. By the previous lemma, there
exists an open neighborhood U^x of x such that U^x × Y ⊆ W^x. The collection of open
sets {U^x}_{x∈X} covers X; hence X = ∪_{j=1}^m U^{x_j} for some finite subset {x_1, ..., x_m}
of X. We claim that the finite collection {W_i^{x_j} : 1 ≤ j ≤ m, 1 ≤ i ≤ n_{x_j}} covers
X × Y. Let (x,y) ∈ X × Y. Then x ∈ U^{x_j} for some 1 ≤ j ≤ m. Now
(x,y) ∈ U^{x_j} × Y ⊆ W^{x_j} = ∪_{i=1}^{n_{x_j}} W_i^{x_j}. Therefore (x,y) ∈ W_i^{x_j} for some
1 ≤ i ≤ n_{x_j}. ∎

Theorem 5.8.13 (Tychonoff's theorem). If X_1, ..., X_n are compact spaces, then
∏_{i=1}^n X_i is compact.


Proof. Use induction, the previous theorem, and the fact that X_1 × ... × X_n is
homeomorphic to X_1 × (X_2 × ... × X_n). ∎


The following topic is included as an excursion. More properties of countably 
compact spaces are explored in the section exercises. 


Definition. A topological space X is said to be countably compact if every 
countable open cover of X contains a finite subcover. 


Example 6. A topological space X is countably compact if and only if, for every
descending sequence F_1 ⊇ F_2 ⊇ ... of nonempty closed sets, ∩_{n=1}^∞ F_n ≠ ∅.


Suppose X is countably compact. If ∩_{n=1}^∞ F_n = ∅, then X = ∪_{n=1}^∞ (X − F_n). The
countable compactness assumption and the fact that the sequence X − F_n is
ascending imply that X = X − F_N for some integer N. This would force F_N = ∅,
which is a contradiction.

To prove the converse, suppose that {U_n} is a countable open cover of X, and for
n ∈ N, define V_n = ∪_{i=1}^n U_i. Finally define F_n = X − V_n. Then F_1 ⊇ F_2 ⊇ ..., and
∩_{n=1}^∞ F_n = ∅ because {V_n} covers X. It follows that F_n = ∅ for some integer n,
hence X = V_n = ∪_{i=1}^n U_i. ♦


Exercises 


1. Show that the union of a finite number of compact subsets of a topological 
space X is compact. 

2. Verify that the proofs of theorems 5.8.1 through 5.8.5 are those included for 
the corresponding theorems in section 4.7, without alteration. 

3. Let X be a compact Hausdorff space, and suppose there exists a countable
set of continuous functions f_n : X → [0,1] such that, for every pair of
distinct points x and y in X, there exists a function f_n such that f_n(x) ≠ f_n(y).
Prove that the function d(x,y) = Σ_{n=1}^∞ 2^{−n} |f_n(x) − f_n(y)| is a metric and
that it induces the topology on X.

4. Let X be a compact space, and let F_1 ⊇ F_2 ⊇ ... be a descending sequence
of nonempty closed subsets of X. Prove that ∩_{n=1}^∞ F_n ≠ ∅.

5. Prove that a compact Hausdorff space X cannot be expressed as a countable
union of (closed) nowhere dense subsets {A_n}.

6. Let 𝒯_1 and 𝒯_2 be topologies on the same set X such that 𝒯_1 is Hausdorff and
𝒯_2 is compact. Prove that if 𝒯_1 ⊆ 𝒯_2, then 𝒯_1 = 𝒯_2. Conclude that if 𝒯 is a
compact Hausdorff topology, then any strictly larger topology than 𝒯 is
not compact, and any strictly smaller topology than 𝒯 is not Hausdorff.*

7. Let X be a topological space, and let 𝔅 be an open base for X. Prove that X
is compact if and only if every open cover of X by members of 𝔅 has a finite
subcover. The same result is true for open subbases, but it is considerably
harder to prove.

8. Prove that if X is a compact space and Y is a Lindelöf space, then X × Y is
Lindelöf.

9. Prove that any two disjoint compact subsets of a Hausdorff space have
disjoint open neighborhoods.

10. Let 𝒦 be a collection of compact subsets of a Hausdorff space X, and let U
be an open subset of X such that ∩{K : K ∈ 𝒦} ⊆ U. Prove that U contains
the intersection of a finite subcollection of 𝒦.

11. Let X be a Hausdorff space. Prove that if K_1 ⊇ K_2 ⊇ ... is a descending
sequence of nonempty compact subsets of X, then ∩_{n=1}^∞ K_n ≠ ∅.

12. Prove that the continuous image of a countably compact space is countably
compact.

13. Prove that a closed subspace of a countably compact space is countably
compact.

14. Prove that a countably compact metric space is compact.

15. Prove that a second countable, countably compact space is compact.

16. Verify that the proofs included in section 4.9 for theorems 4.9.3 (the Stone-
Weierstrass theorem) and 4.9.6 are valid without alteration when X is a
compact Hausdorff topological space.


5.9 Locally Compact Spaces 


Without a doubt, R^n is the most important example of a locally compact Hausdorff
space. We studied locally compact metric spaces briefly in section 4.7. In this
section, we will see that locally compact Hausdorff spaces are regular (theorem
5.9.3); hence they have good separation properties. They are also very nearly normal.
Compare theorems 5.9.2 and 5.6.3. The next section is the natural continuation
of this one, where we show that every locally compact Hausdorff space can be
embedded into a compact Hausdorff space in a special kind of way. We will take
another journey into locally compact spaces in section 5.11, where we establish
Urysohn's theorem for locally compact Hausdorff spaces and introduce the space
of continuous, compactly supported functions on such spaces.


* This property is sometimes referred to as the rigidity of compact Hausdorff topologies.


This section is the transitional section to the remaining three sections in this 
chapter. It may be bypassed on the first reading of the book because locally compact 
metric spaces (section 4.7) are sufficient for most of the rest of the book. Locally 
compact Hausdorff spaces are needed only in sections 8.4 and 8.7, where frequent 
reference is made to the results in this section and sections 5.10 and 5.11, and where 
certain theorems are extended from R^n to locally compact Hausdorff spaces.


Definition. A topological space X is locally compact if, for every x ∈ X, there
exists an open set V such that x ∈ V and the closure V̄ is compact. Thus every
point is in the interior of a compact set.


We established in section 4.7 that R^n is locally compact and that J is not. See
theorem 6.1.5 for a far-reaching result. Also in section 4.7, we showed that Q is 
not locally compact. 


Theorem 5.9.1. Let X be a Hausdorff space. Then X is locally compact if and only
if, for every x ∈ X and every open neighborhood U of x, there exists an open
neighborhood V of x such that V̄ is compact and V̄ ⊆ U.


Proof. Suppose X is locally compact, and let x and U be as in the statement of the
theorem. Let K be a compact subset of X that contains x in its interior, and
let F = K − U. As F is a closed subset of the compact subset K, it is compact.
Invoking theorem 5.8.9 yields disjoint open sets W_1 and W_2 such that x ∈ W_1 and
F ⊆ W_2. Define V = W_1 ∩ int(K). Since K is compact (hence closed, by theorem
5.8.3) and V ⊆ K, V̄ ⊆ K and V̄ is compact. Finally, since V ⊆ X − W_2, and the latter
set is closed, V̄ ⊆ X − W_2 ⊆ X − F. Thus V̄ ⊆ K ∩ (X − F) = K − F ⊆ U. The proof of
the converse is trivial. ∎
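
The chain of inclusions at the end of this proof can be laid out step by step. The following is a sketch in the notation of the proof (W_1, W_2 the disjoint open sets, F = K − U), writing the closure of V with an overline:

```latex
% Sketch: why \overline{V} is compact and contained in U (proof of theorem 5.9.1).
\begin{align*}
V \subseteq \operatorname{int}(K) \subseteq K,\ K \text{ closed}
   &\implies \overline{V} \subseteq K
   && (\text{so } \overline{V} \text{ is compact}) \\
V \subseteq W_1 \subseteq X - W_2,\ X - W_2 \text{ closed}
   &\implies \overline{V} \subseteq X - W_2 \subseteq X - F \\
\overline{V} \subseteq K \cap (X - F)
   &= K - (K - U) = K \cap U \subseteq U .
\end{align*}
```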


The next result generalizes the last. 


Theorem 5.9.2. Let X be a locally compact Hausdorff space, and let U be an open
neighborhood of a compact subset K of X. Then there exists an open neighborhood
V of K such that V̄ is compact and V̄ ⊆ U.


Proof. By theorem 5.9.1, every point x ∈ K has an open neighborhood V_x with
compact closure such that V̄_x ⊆ U. Since K is compact and K ⊆ ∪_{x∈K} V_x,
K ⊆ ∪_{i=1}^n V_{x_i} for some finite subset {x_1, ..., x_n} of K. The open set
V = ∪_{i=1}^n V_{x_i} has the desired properties. ∎


The following result is a direct consequence of theorems 5.6.2 and 5.9.1. 

Theorem 5.9.3. A locally compact Hausdorff space is regular.


Theorem 5.9.4. Let X be a second countable locally compact Hausdorff space. Then 
X is a countable union of compact subsets of X. 


Proof. Let 𝔅 be a countable open base for X. For every x ∈ X, there is an open set
V_x such that x ∈ V_x and V̄_x is compact. Let B_x ∈ 𝔅 be such that x ∈ B_x ⊆ V_x.
Clearly, B̄_x ⊆ V̄_x; thus B̄_x is compact. Now X = ∪_{x∈X} B̄_x. Since 𝔅 is countable, only
countably many of the sets B_x can be distinct, showing that X is a countable union
of compact subsets of X. ∎


Definition. A topological space X is said to be σ-compact if it is the countable
union of compact subsets.


For example, R^n is σ-compact. More generally, the above theorem states that a
second countable locally compact Hausdorff space is σ-compact.


We will use the following result in the next section to prove a simple character- 
ization of locally compact Hausdorff spaces. The proof is left as an exercise. 


Proposition 5.9.5. An open subspace of a locally compact Hausdorff space is locally 
compact. 




Exercises 


1. Prove proposition 5.9.5.

2. Prove that a closed subspace of a locally compact space is locally compact.

3. Prove that the product of two locally compact spaces is locally compact.

4. Prove that a second countable locally compact Hausdorff space is normal.

5. Let f be a continuous, open mapping from a locally compact space X onto a
topological space Y. Prove that Y is locally compact.

6. Prove that if E and F are compact subsets of a locally compact Hausdorff
space X, then E and F have disjoint neighborhoods with compact closures.

7. Prove that a compact subspace of the Sorgenfrey line R_l is countable.
Conclude that R_l is not locally compact. Hint: Let K be compact in R_l, and let
x ∈ K. Clearly, R = ∪_{n=1}^∞ (−∞, x − 1/n) ∪ [x, ∞). Let n be the least positive
integer such that K ⊆ (−∞, x − 1/n) ∪ [x, ∞). Set a_x = x − 1/n. Clearly,
(a_x, x] ∩ K = {x}. Show that if x and y are distinct points of K, then
(a_x, x] ∩ (a_y, y] = ∅.

5.10 Compactification


In this section, we show that a locally compact Hausdorff space (X, 𝒯) can be
embedded in a compact Hausdorff space (X_∞, 𝒯_∞) in the manner described in
theorem 5.10.1. In that theorem, the definition of the topology 𝒯_∞ requires some
explanation.


The prototypical and most important example of a locally compact Hausdorff
space is R^n. We focus here on R², because the stereographic projection of the
punctured sphere S²_* onto R² is easy to visualize and provides an excellent
motivation for the definition of 𝒯_∞. The stereographic projection has been
known to mapmakers since the late sixteenth century, and it is reasonable to
surmise that Alexandroff was aware of that projection when he invented the
topology 𝒯_∞.


It is clear that a compactification of the plane (more literally, its homeomorphic
image S²_*) is the compact sphere S², which contains S²_* and a single additional
point N. Some reflection reveals that there are two types of open subsets of the
compact sphere:


(a) The open subsets of S² that do not contain N: These are in one-to-one cor-
respondence (through the stereographic projection) with the open subsets
of the usual topology of R².

(b) The open subsets U of S² that contain the point N: The complement
K = S² − U of such an open set is closed in S². Since S² is compact, K is
compact. Thus the open sets U of this type are exactly the complements
of compact subsets of the punctured sphere, which are in one-to-one
correspondence with the compact subsets of R².


The above discussion suggests that a likely construction of a compact topology that
contains the usual topology on R² can be obtained by adding a single point, which
we call ∞ (this point corresponds to the point N on the compact sphere), to R²
and defining the topology on R² ∪ {∞} to consist of the above two types of sets. This
is exactly how the topology 𝒯_∞ in theorem 5.10.1 is defined.


Theorem 5.10.1. Let (X, 𝒯) be a locally compact Hausdorff space that is not
compact. Then there exists a compact Hausdorff space (X_∞, 𝒯_∞) containing (X, 𝒯)
such that


(i) X_∞ − X is a single point,
(ii) 𝒯 is the restriction of 𝒯_∞ to X, and
(iii) X is dense in (X_∞, 𝒯_∞).

Proof. Take an object, which we give the symbol ∞, that does not belong to X, and
let X_∞ = X ∪ {∞}.
We define 𝒯_∞ to be the collection of subsets of X_∞ of one of the following two
types:


(a) all the members of 𝒯, or
(b) subsets of X_∞ of the form X_∞ − K, where K is a compact subset of X.


We claim that 𝒯_∞ is a topology that satisfies the stated properties. We leave it to
the reader to verify that the intersection of two members of 𝒯_∞ belongs to 𝒯_∞. To
show that the union of an arbitrary subcollection of 𝒯_∞ is in 𝒯_∞, we work out three
cases:


1. Since 𝒯 is a topology and 𝒯 ⊆ 𝒯_∞, the union of open sets of type (a) is in 𝒯_∞.

2. Consider the union of a collection {X_∞ − K_α}_α of subsets of X_∞ of type (b);
∪_α (X_∞ − K_α) = X_∞ − ∩_α K_α, which is in 𝒯_∞ because ∩_α K_α is compact in X.

3. Consider the union of a subcollection {U_α}_α of 𝒯 and a subcollection
{X_∞ − K_β}_β of 𝒯_∞, where each K_β is compact. By cases 1 and 2 above, ∪_α U_α ∈ 𝒯
and ∪_β (X_∞ − K_β) ∈ 𝒯_∞. Write ∪_α U_α = U and ∪_β (X_∞ − K_β) = X_∞ − K. Now
(∪_α U_α) ∪ (∪_β (X_∞ − K_β)) = U ∪ (X_∞ − K) = X_∞ − (K − U), which is in
𝒯_∞ because K − U is compact in X. This proves that 𝒯_∞ is a topology.
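
The set identity used in case 3, and the compactness of the set K − U, can be checked directly. The following is a sketch, with U open in X and K compact in X as in the proof:

```latex
% Sketch: U \cup (X_\infty - K) = X_\infty - (K - U), and K - U is compact.
\begin{align*}
X_\infty - \bigl( U \cup (X_\infty - K) \bigr)
   &= (X_\infty - U) \cap K          && \text{(De Morgan, complements taken in } X_\infty) \\
   &= K - U ,
\end{align*}
% hence U \cup (X_\infty - K) = X_\infty - (K - U).
% Compactness: K - U = K \cap (X - U) is a closed subset of the compact set K,
% so it is compact by theorem 5.8.2.
\[
K - U = K \cap (X - U) \quad \text{is closed in } K \text{ and therefore compact.}
\]
```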


We verify that 𝒯 is the restriction of 𝒯_∞ to X. Given an open subset of X_∞, its
intersection with X is in 𝒯 since, for an open subset U of X, U ∩ X = U and, for a
compact subset K of X, (X_∞ − K) ∩ X = X − K, which is open in X. The converse
is trivial since, for an open subset U of X, U ∩ X_∞ = U.


Next we show that X is dense in X_∞ by showing that every open neighborhood of
∞ intersects X. Such a neighborhood is of the type X_∞ − K, where K is a compact
subset of X. Since X is not compact, X − K ≠ ∅, and clearly (X_∞ − K) ∩ X ≠ ∅.


We now show that 𝒯_∞ is Hausdorff. It is sufficient to show that if x ∈ X, then x and
∞ have disjoint open neighborhoods in X_∞. Since X is locally compact, there is a
compact subset K of X such that x ∈ int(K). Let U = int(K), and V = X_∞ − K; U
and V are disjoint neighborhoods of x and ∞.


Finally, we show that X_∞ is compact. Let 𝒰 be an open cover of X_∞; 𝒰 must
contain a member of the type X_∞ − K for some compact subset K of X. Let
𝒰' be 𝒰 with the exclusion of X_∞ − K. The intersections of X with the members
of 𝒰' clearly cover K. Thus there exists a finite subcollection 𝒰'' of 𝒰' such
that K ⊆ ∪{X ∩ W : W ∈ 𝒰''}. Since X_∞ = K ∪ (X_∞ − K), the finite subcollection
𝒰'' ∪ {X_∞ − K} of 𝒰 covers X_∞. ∎

Definition. The topological space (X_∞, 𝒯_∞) we constructed in theorem 5.10.1 is
called the one-point (or Alexandroff) compactification of the locally compact
Hausdorff space X.


Theorem 5.10.2. The one-point compactification of a locally compact Hausdorff
space X is unique up to homeomorphism. More specifically, if Y is a compact
Hausdorff space and φ : X → Y is a topological embedding such that Y − φ(X)
is a single point, then φ can be extended to a homeomorphism φ_∞ : X_∞ → Y.


Proof. Let Y − φ(X) = {ω}, and extend φ to a function φ_∞ : X_∞ → Y by defining
φ_∞(∞) = ω. Trivially, φ_∞ is a bijection. Since both X_∞ and Y are compact
Hausdorff spaces, we need only show that φ_∞^{-1} is continuous; see theorem 5.8.7.
Equivalently, we show that φ_∞ is an open mapping. If V is an open subset of X,
then φ_∞(V) = φ(V), which is open in φ(X) by the assumption that φ is an embedding,
and hence open in Y because φ(X) = Y − {ω} is open in Y. If V contains ∞, then
V = X_∞ − K for some compact subset K of X. Now φ_∞(V) =
φ_∞(X_∞ − K) = Y − φ_∞(K) = Y − φ(K). The compactness of K together with the
continuity of φ imply that φ(K) is compact. By theorem 5.8.3, φ(K) is closed, and
Y − φ(K) is open. ∎


Example 1. Let χ be the chordal metric on R. Recall that (R, χ) is homeomor-
phic to R with the usual topology. Therefore the one-point compactification
of R with respect to the usual topology is homeomorphic to the one-point
compactification of (R, χ). By the very definition of the chordal metric, (R, χ)
is homeomorphic to the punctured circle S¹ − {N}. Therefore the one-point
compactification of (R, χ), hence of (R, 𝒯), is the compactification of S¹ − {N},
which is clearly S¹. We have arrived at the following result: the one-point
compactification of R is the circle. The same argument shows that the one-point
compactification of the complex plane C (identified with the Euclidean plane
R²) is the sphere. This is the reason the sphere is thought of as the extended
complex plane and is often called the Riemann sphere. ♦


Example 2. The one-point compactification of the open interval (0, 1) is the circle
S¹. To see this, recall that the unit interval (0, 1) is homeomorphic to the line R.
Since the one-point compactification of R is S¹, the compactification of (0, 1) is S¹. ♦


Example 3. While the one-point compactification X_∞ of a locally compact Haus-
dorff space X is essentially unique, it is possible to embed X as a dense subspace
of other compact spaces not homeomorphic to X_∞. For example, another compactifi-
cation of the open unit interval (0,1) is the closed interval [0,1], which is not
homeomorphic to S¹. ♦

Example 4. The one-point compactification of the punctured line, X = (−∞, 0) ∪
(0, ∞), is homeomorphic to the union of two externally tangent circles (a figure
eight).


Each of the open half lines is homeomorphic to an open half circle, as shown in
figure 5.1(a). There are several ways to see this. The stereographic projection
is the easiest to visualize. The next step is to pull the two open half circles
horizontally apart a distance equal to the diameter of each half circle, as shown
in figure 5.1(b). Now each half circle is homeomorphic to a punctured circle.
For example, the function f(e^{iθ}) = e^{2iθ} maps the half circle {e^{iθ} : −π/2 < θ < π/2}
onto the punctured circle {e^{iθ} : −π < θ < π}. Hence X is homeomorphic to the
union of the two tangent punctured circles shown in figure 5.1(c). If we define
the point at infinity to be the missing point of tangency, we obtain the figure
eight shown in figure 5.1(d). ♦


The following succinct characterization of locally compact Hausdorff spaces fol-
lows directly from proposition 5.9.5 and theorem 5.10.1.


Theorem 5.10.3. A Hausdorff space is locally compact if and only if it is an open 
subspace of a compact Hausdorff space. 


Figure 5.1 (panels (a), (b), (c), and (d))

Exercises


1. Let X = N with the restricted topology induced by the usual topology on R.
This topology is, in fact, the discrete topology on N. Prove that the one-point
compactification of X is homeomorphic to the space X_∞ = {1/n : n ∈ N} ∪ {0}
(as a subspace of R).

2. Prove that the one-point compactification of R^n is the sphere S^n.

3. What is the one-point compactification of the open unit disk in R²?

4. By generalizing the idea of example 4, make a conjecture about the one-point
compactification of the union of the two open half planes {(x,y) ∈ R² : x >
0} ∪ {(x,y) ∈ R² : x < 0}.

5. Prove that if a locally compact Hausdorff space X is second countable, then
so is X_∞.


5.11 Metrization 


We now turn to the question of which topologies are induced by a metric. Theorem 
5.11.3 is the main result in this section. Although it is not the best known result, 
it does establish sufficient conditions for metrization. The proof techniques we 
develop along the path to theorem 5.11.3 are elegant and important in their own 
right. We first state the following definition. 


Definition. A topological space (X, 𝒯) is metrizable if there is a metric d on X
that induces the topology 𝒯.


Lemma 5.11.1. Suppose X is a normal space, and let E and F be disjoint closed
subsets of X. Let C be the set of rational points in the interval [0,1]. Then there
exists a countable collection of open subsets {U_p : p ∈ C} such that


if p, q ∈ C and p < q, then Ū_p ⊆ U_q. (*)


Additionally, for all p ∈ C, E ⊆ U_p and Ū_p ⊆ X − F.


Proof. Let $p_0 = 0$ and $p_1 = 1$, and let $\{p_2, p_3, p_4, \ldots\}$ be an
enumeration of the rational points in $(0,1)$. Since $E \subseteq X - F$,
theorem 5.6.3 yields an open set $U_1$ such that
$E \subseteq U_1 \subseteq \overline{U_1} \subseteq X - F$. Another application
of theorem 5.6.3 yields an open set $U_0$ such that
$E \subseteq U_0 \subseteq \overline{U_0} \subseteq U_1$. The rest of the
construction is inductive. Suppose that, for each element $p_i$ of the finite
set $C_n = \{p_0, \ldots, p_n\}$, we have found an open set $U_{p_i}$ such that
the sets $U_{p_0}, \ldots, U_{p_n}$ satisfy condition (*) for $p, q \in C_n$.
Consider the rational number $p_{n+1}$. It must fall strictly between two
elements of $C_n$ that are adjacent in the usual order, say,
$p_i < p_{n+1} < p_j$. Again by theorem 5.6.3, there exists an open set
$U_{p_{n+1}}$ such that
$\overline{U_{p_i}} \subseteq U_{p_{n+1}} \subseteq \overline{U_{p_{n+1}}} \subseteq U_{p_j}$.
By construction, the sets $U_{p_0}, \ldots, U_{p_{n+1}}$ satisfy condition (*)
for $p, q \in C_{n+1}$. Since, for every pair of points $p$ and $q$ in $C$,
there is a finite set $C_n$ that contains $p$ and $q$, the proof is complete.

The inclusions $E \subseteq U_p$ and $\overline{U_p} \subseteq X - F$ for all
$p \in C$ are obvious since $E \subseteq U_0$ and
$\overline{U_1} \subseteq X - F$. ■


Remark. Any dense subset $C$ of $[0,1]$ containing 0 and 1 can be used in
the construction of the collection $\{U_p : p \in C\}$. A commonly used such
set is the set of dyadic rational numbers¹
$D = \{0, 1, 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, \ldots\}$,
which is slightly more advantageous in the visualization of the construction of
the sets $\{U_p\}$.
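
The order in which the dyadic rationals are listed mirrors the inductive step in the proof of lemma 5.11.1: each new point falls strictly between two earlier points that are adjacent in the usual order. The following Python sketch is an illustration only and is not part of the original text; the names `dyadic_levels` and `neighbours` are hypothetical. It lists the points level by level and records, for each new point, the pair of earlier points between which the new open set would be inserted.

```python
from fractions import Fraction


def dyadic_levels(max_level):
    """Yield dyadic rationals in [0, 1] in the order 0, 1, 1/2, 1/4, 3/4, ..."""
    yield Fraction(0)
    yield Fraction(1)
    for n in range(1, max_level + 1):
        for k in range(1, 2 ** n, 2):  # odd numerators only
            yield Fraction(k, 2 ** n)


def neighbours(points, new_point):
    """Return the largest earlier point below new_point and the smallest above it."""
    below = max(p for p in points if p < new_point)
    above = min(p for p in points if p > new_point)
    return below, above


seen = []   # points listed so far
steps = []  # (new point, left neighbour, right neighbour)
for p in dyadic_levels(3):
    if len(seen) >= 2:
        steps.append((p, *neighbours(seen, p)))
    seen.append(p)
```

Each triple in `steps` corresponds to one application of theorem 5.6.3 in the inductive step: the closure of the set indexed by the left neighbour is placed inside the new set, whose closure is placed inside the set indexed by the right neighbour.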


The following theorem is crucial for the proof of theorem 5.11.3. It is also of
great importance in its own right.


Theorem 5.11.2 (Urysohn’s lemma). Suppose that $X$ is a normal space and that
$E$ and $F$ are disjoint closed subsets of $X$. Then there exists a continuous
function $f : X \to [0,1]$ such that $f(E) = 1$ and $f(F) = 0$.


Proof. Let $C$ and $\{U_p : p \in C\}$ be as in lemma 5.11.1. For
$p, q \in C$, define

$$f_p = p\,\chi_{U_{1-p}} \quad\text{and}\quad g_q = q + (1-q)\,\chi_{\overline{U_{1-q}}}.$$

The function $f_p$ is lower semicontinuous and $g_q$ is upper semicontinuous by
theorem 5.3.4 ($U_{1-p}$ is open, and $\overline{U_{1-q}}$ is closed). Now
define

$$f = \sup_{p \in C} f_p \quad\text{and}\quad g = \inf_{q \in C} g_q.$$

Again theorem 5.3.4 implies that $f$ is lower semicontinuous and that $g$ is
upper semicontinuous.

If $x \in E$, then $x \in U_{1-p}$ for every $p \in C$, and hence
$f(x) = \sup_{p \in C} p = 1$. If $x \in F$, then $x \notin U_{1-p}$ for every
$p \in C$, and hence $f(x) = 0$.


By theorem 5.3.4, the proof will be complete if we show that f = g. 


¹ $D = \{0, 1\} \cup \bigcup_{n=1}^{\infty} \{k/2^n : k = 1, 3, 5, \ldots, 2^n - 1\}$.
We claim that, for all $p, q \in C$, $f_p \le g_q$. It follows immediately that
$f \le g$. If, for some $x \in X$, $g_q(x) < f_p(x)$, then $f_p(x) > 0$ and
$g_q(x) < 1$. Thus

$$f_p(x) = p, \text{ hence } x \in U_{1-p}, \text{ and}$$
$$g_q(x) = q, \text{ hence } x \notin \overline{U_{1-q}}.$$

Now $p = f_p(x) > g_q(x) = q$; hence $1 - p < 1 - q$, and
$\overline{U_{1-p}} \subseteq U_{1-q}$. This contradicts $x \in U_{1-p}$ and
$x \notin \overline{U_{1-q}}$ and establishes the claim that $f_p \le g_q$.


Suppose, for a contradiction, that $f(x) < g(x)$ for some $x \in X$. Because
$C$ is dense in $[0,1]$, there are points $p, q \in C$ such that
$f(x) < p < q < g(x)$. Now $f(x) < p$ implies that $x \notin U_{1-p}$, and
$g(x) > q$ implies $x \in \overline{U_{1-q}}$. This is a contradiction because
$1 - q < 1 - p$; hence $\overline{U_{1-q}} \subseteq U_{1-p}$. The
contradiction concludes the proof. ■


Theorem 5.11.3 (the Urysohn metrization theorem). Every regular second
countable topological space $(X, \mathcal{T})$ is metrizable.


Proof. According to theorem 5.7.4, $X$ is actually a normal space. Let
$\mathcal{B} = \{B_1, B_2, \ldots\}$ be a countable open base for the topology.
If $x \in B_m$, then, by theorem 5.6.2 and the fact that $\mathcal{B}$ is an
open base, there exists a basis element $B_n \in \mathcal{B}$ such that
$x \in B_n \subseteq \overline{B_n} \subseteq B_m$. Therefore the collection of
pairs $P = \{(B_n, B_m) : \overline{B_n} \subseteq B_m\}$ is not empty. Since
$P$ is countable, we enumerate $P$ as follows:
$P = \{(B_{n_i}, B_{m_i}) : i \in \mathbb{N}\}$. By theorem 5.11.2, for each
$i \in \mathbb{N}$, a continuous function $f_i : X \to [0,1]$ exists such that
$f_i(\overline{B_{n_i}}) = 0$ and $f_i(X - B_{m_i}) = 1$.

We now define a metric $d$ on $X$ as follows:

$$d(x,y) = \Big( \sum_{i=1}^{\infty} \frac{(f_i(x) - f_i(y))^2}{i^2} \Big)^{1/2}.$$

It is clear that the series in the above definition converges since
$|f_i(x) - f_i(y)| \le 1$. In fact, the sequences
$\varphi_x = (f_1(x), f_2(x)/2, \ldots, f_i(x)/i, \ldots)$ and
$\varphi_y = (f_1(y), f_2(y)/2, \ldots, f_i(y)/i, \ldots)$ are in $\ell^2$, and
$d(x,y)$ is nothing but the $\ell^2$ distance between $\varphi_x$ and
$\varphi_y$. It is now clear that $d$ is a metric once we show that the
function $x \mapsto \varphi_x$ is an injection. Let $x$ and $y$ be distinct
elements of $X$, and let $U$ be an open neighborhood of $x$ that excludes $y$.
Choose a basis member $B_m$ such that
$x \in B_m \subseteq \overline{B_m} \subseteq U$, then choose a basis member
$B_n$ such that $x \in B_n \subseteq \overline{B_n} \subseteq B_m$. The pair
$(B_n, B_m) \in P$, and hence $(B_n, B_m) = (B_{n_i}, B_{m_i})$ for some
$i \in \mathbb{N}$. It follows that $f_i(x) = 0$ and $f_i(y) = 1$, so
$\varphi_x \ne \varphi_y$.

We now show that the metric $d$ induces the topology $\mathcal{T}$.

We prove that every $d$-open subset $U$ of $X$ is $\mathcal{T}$-open. Let
$x \in U$, and let $r > 0$ be such that $B(x,r) \subseteq U$. Here $B(x,r)$ is
the $d$-ball of radius $r$ centered at $x$. We will show that there exists a
$\mathcal{T}$-open set $V$ such that $x \in V \subseteq B(x,r)$. First choose
an integer $N$ such that $\sum_{i=N+1}^{\infty} 1/i^2 < r^2/2$. Since each
$f_i$ is continuous, there exists an open neighborhood $V_i$ of $x$ such that,
for every $y \in V_i$, $|f_i(x) - f_i(y)|^2 < r^2/(2N)$. We claim that the set
$V = \bigcap_{i=1}^{N} V_i$ is the set we seek. If $y \in V$, then

$$[d(x,y)]^2 = \sum_{i=1}^{N} \frac{(f_i(x) - f_i(y))^2}{i^2} + \sum_{i=N+1}^{\infty} \frac{(f_i(x) - f_i(y))^2}{i^2} < \sum_{i=1}^{N} \frac{r^2}{2N} + \sum_{i=N+1}^{\infty} \frac{1}{i^2} < \frac{r^2}{2} + \frac{r^2}{2} = r^2.$$


To show that every $\mathcal{T}$-open set is $d$-open, it is sufficient to show
that every basic open set $B_m$ is $d$-open. Let $x \in B_m$. We need to show
that there exists $r > 0$ such that $B(x,r) \subseteq B_m$. By theorem 5.6.2,
there exists a basis element $B_n$ such that
$x \in B_n \subseteq \overline{B_n} \subseteq B_m$. Now $(B_n, B_m) \in P$,
say, $(B_n, B_m) = (B_{n_i}, B_{m_i})$. Let $r = 1/(2i)$. If $y \in B(x,r)$,
then $\sum_{j=1}^{\infty} (f_j(x) - f_j(y))^2 / j^2 < 1/(4i^2)$. In particular,
$(f_i(x) - f_i(y))^2 / i^2 < 1/(4i^2)$, and $|f_i(x) - f_i(y)| < 1/2$. Because
$f_i(x) = 0$, $|f_i(y)| < 1/2$. Since $f_i(X - B_{m_i}) = 1$,
$y \in B_{m_i} = B_m$. ■
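
The metric in the proof is a weighted $\ell^2$ distance, and its tail is controlled by the convergent series $\sum 1/i^2$. The following Python sketch is only an illustration under stated assumptions: the space is taken to be $[0,1]$, and the functions produced by the hypothetical helper `make_f` are stand-ins for the Urysohn functions $f_i$ (they are continuous with values in $[0,1]$, but they are not the functions constructed in the proof). It evaluates a truncation of the series and a bound on the neglected tail.

```python
import math


def make_f(i):
    """Hypothetical stand-in for f_i: continuous on [0, 1] with values in [0, 1]."""
    return lambda x: min(1.0, i * x)


def d_truncated(x, y, n_terms):
    """Square root of the first n_terms terms of the series defining d(x, y)^2."""
    return math.sqrt(sum((make_f(i)(x) - make_f(i)(y)) ** 2 / i ** 2
                         for i in range(1, n_terms + 1)))


def tail_bound(n_terms):
    """Bound 1/n_terms on sum_{i > n_terms} 1/i^2, by comparison with an integral."""
    return 1.0 / n_terms


x, y, z = 0.1, 0.35, 0.8
dxy = d_truncated(x, y, 200)
dyz = d_truncated(y, z, 200)
dxz = d_truncated(x, z, 200)
```

Because every term of the series is at most $1/i^2$, the squared distance differs from its truncation after $N$ terms by less than `tail_bound(N)`. This is the same estimate that allows the proof to choose $N$ before shrinking the neighborhoods $V_i$.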


The conditions of theorem 5.11.3 are not necessary for a space to be metrizable.
For example, the space $\ell^\infty$ is metrizable but not second countable.
However, if we limit ourselves to compact Hausdorff spaces, the conditions of
theorem 5.11.3 are necessary as well as sufficient, as the next theorem shows.


Theorem 5.11.4. A compact Hausdorff space is metrizable if and only if it is second 
countable. 


Proof. A compact Hausdorff space $X$ is normal by theorem 5.8.10. Therefore if
$X$ is second countable, it is metrizable by the previous theorem. To prove the
converse, recall that a compact metric space is second countable (see problem 7
in section 4.7 and theorem 4.5.1). ■


We now venture back into locally compact Hausdorff spaces. The following
theorem is the closest analog of theorem 5.11.2 for locally compact Hausdorff
spaces, which need not be normal. It is sometimes referred to as Urysohn’s
theorem for locally compact spaces.


Theorem 5.11.5. Let $X$ be a locally compact Hausdorff space, and let $K$ and
$F$ be disjoint subsets of $X$ such that $K$ is compact and $F$ is closed. Then
there exists a continuous function $f : X \to [0,1]$ such that $f(K) = 1$ and
$f(F) = 0$.
Proof. Applying theorem 5.9.2 to the compact set $K$ and the open set $X - F$,
there exists an open subset $V$ with compact closure such that
$K \subseteq V \subseteq \overline{V} \subseteq X - F$. Applying theorem 5.9.2
again to the compact set $K$ and the open set $V$, we can find an open set $U$
with compact closure such that
$K \subseteq U \subseteq \overline{U} \subseteq V$. Now the subspace
$\overline{V}$ with the restricted topology is a compact Hausdorff space and is
therefore normal by theorem 5.8.10. Applying theorem 5.11.2 to the closed
subsets $K$ and $\overline{V} - U$ of $\overline{V}$, there is a continuous
function $f : \overline{V} \to [0,1]$ such that $f(K) = 1$ and
$f(\overline{V} - U) = 0$. Extend $f$ to a continuous function
$f : X \to [0,1]$ by defining $f(x) = 0$ for all $x \in X - U$. Problem 4 in
section 5.3 is relevant here to show the continuity of the extended $f$. ■


Definition. Let $f$ be a complex-valued function on a topological space $X$.
The support of $f$, written $\operatorname{supp}(f)$, is the closure of the set
$\{x \in X : f(x) \ne 0\}$. A function $f : X \to \mathbb{C}$ is said to have
compact support if $\operatorname{supp}(f)$ is compact. The set $C_c(X)$ of
continuous, compactly supported functions is clearly a subspace of $BC(X)$.


The following corollary is of pivotal importance in studying a certain class of 
measures on locally compact Hausdorff spaces. In section 8.4, we present the 
main examples of such a measure: Lebesgue and Radon measures. 


Corollary 5.11.6. Suppose that $X$ is a locally compact Hausdorff space, that
$K$ is a compact subspace of $X$, and that $V$ is an open neighborhood of $K$.
Then there exists a continuous function of compact support,
$f : X \to [0,1]$, such that $f(K) = 1$ and
$\operatorname{supp}(f) \subseteq V$.


Proof. Apply theorem 5.9.2 to find an open set $U$ with compact closure such
that $K \subseteq U \subseteq \overline{U} \subseteq V$. Now apply theorem
5.11.5 to the sets $K$ and $F = X - U$ to find a function $f : X \to [0,1]$
such that $f(K) = 1$ and $f(X - U) = 0$. Observe that
$\operatorname{supp}(f) \subseteq \overline{U}$, which is compact and contained
in $V$. ■


Definition. Let $X$ be a locally compact Hausdorff space. A continuous,
scalar-valued function $f$ is said to vanish at $\infty$ if, for every
$\epsilon > 0$, there exists a compact subset $K$ of $X$ such that
$|f(x)| < \epsilon$ for every $x \in X - K$. The set $C_0(X)$ of all continuous
scalar-valued functions on $X$ that vanish at $\infty$ is clearly a vector
space.


Theorem 5.11.7. A function $f \in C_0(X)$ is bounded, and the space $C_0(X)$ is
a complete normed linear space under the supremum norm.


Proof. We leave it to the reader to show that $C_0(X) \subseteq BC(X)$. We
prove that $C_0(X)$ is closed in $BC(X)$. Let $f \in BC(X)$ be a closure point
of $C_0(X)$, and let $\epsilon > 0$. There exists a function $g \in C_0(X)$
such that $\|f - g\|_\infty < \epsilon/2$. Let $K$ be a compact subset of $X$
such that $|g(x)| < \epsilon/2$ whenever $x \notin K$. Now if $x \notin K$,
then $|f(x)| \le |f(x) - g(x)| + |g(x)| < \epsilon/2 + \epsilon/2 = \epsilon$.
Thus $f \in C_0(X)$. Since $C_0(X)$ is a closed subspace of the complete space
$BC(X)$, it is complete. ■


Theorem 5.11.8. Suppose $X$ is a locally compact Hausdorff space. Then $C_c(X)$
is dense in $C_0(X)$.


Proof. Let $g \in C_0(X)$, let $\epsilon > 0$, and let $K$ be a compact subset
of $X$ such that $|g(x)| < \epsilon$ for $x \in X - K$. By theorem 5.9.2, there
is an open subset $V$ with compact closure such that $K \subseteq V$. By
corollary 5.11.6, there exists a function $f \in C_c(X)$ such that $f(K) = 1$,
$0 \le f(x) \le 1$, and $\operatorname{supp}(f) \subseteq V$. The function $fg$
is in $C_c(X)$ and $\|g - fg\|_\infty < \epsilon$. ■
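
The construction in this proof can be checked numerically in a simple case. The sketch below is an illustration only, under assumptions that are not in the original text: $X = \mathbb{R}$, $g(x) = 1/(1+x^2)$, $\epsilon = 0.01$, $K = [-10, 10]$, and a piecewise linear cutoff (the hypothetical helper `cutoff`) playing the role of the function $f$ supplied by corollary 5.11.6. The supremum is estimated on a finite grid, so the code is a sanity check, not a proof.

```python
def g(x):
    """A function vanishing at infinity on the real line."""
    return 1.0 / (1.0 + x * x)


def cutoff(x, radius, margin=1.0):
    """Continuous, equal to 1 on [-radius, radius], 0 where |x| >= radius + margin."""
    return min(1.0, max(0.0, (radius + margin - abs(x)) / margin))


eps = 0.01
radius = 10.0  # |g(x)| < eps whenever |x| > radius, since 1/101 < 0.01

# Estimate sup |g - f g| on a grid covering [-50, 50] with step 0.01.
grid = [k / 100.0 for k in range(-5000, 5001)]
sup_error = max(abs(g(x) - cutoff(x, radius) * g(x)) for x in grid)
```

On $K$ the difference $g - fg$ vanishes because $f = 1$ there, and off $K$ it is bounded by $|g|$, which is the estimate used in the proof.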


Exercises 
In the problems below, X is a locally compact Hausdorff space. 


1. Prove that $C_0(X) \subseteq BC(X)$.

2. Prove that $f \in C_0(X)$ if and only if, for every $\epsilon > 0$, the set
$\{x \in X : |f(x)| \ge \epsilon\}$ is compact.

3. Prove that $f \in C_0(X)$ if and only if $f$ is the restriction to $X$ of a
function $g \in C(X_\infty)$ such that $g(\infty) = 0$. Here $X_\infty$ is the
one-point compactification of $X$.


5.12 The Product of Infinitely Many Spaces 


This section generalizes section 5.4. First we review some terminology and 
notation. 


Let $\{X_\alpha\}_{\alpha \in I}$ be an arbitrary collection of nonempty sets.
The Cartesian product $X = \prod_{\alpha \in I} X_\alpha$ is the set of all
functions $x : I \to \bigcup_{\alpha \in I} X_\alpha$ such that, for every
$\alpha \in I$, $x(\alpha) \in X_\alpha$. We write $x_\alpha$ instead of
$x(\alpha)$, and we denote an element of $X$ by
$x = (x_\alpha)_{\alpha \in I}$ or simply $x = (x_\alpha)$. For a fixed
$\alpha \in I$, the projection of $X$ onto the factor set $X_\alpha$ is the
function $\pi_\alpha(x) = x_\alpha$.


Let $\{(X_\alpha, \mathcal{T}_\alpha)\}_{\alpha \in I}$ be a collection of
topological spaces, and let $X = \prod_{\alpha} X_\alpha$ be the Cartesian
product of the underlying sets. As in the definition of the product topology in
section 5.4, we would like the product topology to guarantee the continuity of
all the projections $\pi_\alpha : X \to X_\alpha$. One might be tempted to
adopt the following simple generalization of the product of finitely many
spaces. Consider the topology $\mathcal{T}$, which has the following subbase:

$$\Big\{ \prod_{\alpha \in I} U_\alpha : U_\alpha \in \mathcal{T}_\alpha \Big\}.$$

It would be a hasty decision to define $\mathcal{T}$ to be the product
topology. Although $\mathcal{T}$ certainly guarantees the continuity of all the
projections, it is too wasteful because, in order to guarantee the continuity
of $\pi_\alpha$, we only need the openness of sets of the form
$\pi_\alpha^{-1}(U_\alpha)$, where $U_\alpha \in \mathcal{T}_\alpha$. A little
reflection shows that²

$$\pi_\alpha^{-1}(U_\alpha) = U_\alpha \times \prod_{\beta \ne \alpha} X_\beta.$$

Therefore the smallest topology which guarantees the continuity of all the
projections is the topology whose subbase is the collection
$\{\pi_\alpha^{-1}(U_\alpha) : \alpha \in I, U_\alpha \in \mathcal{T}_\alpha\}$.


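We now formalize the above motivation to define the product topology. Let
$\mathcal{S} = \{\pi_\alpha^{-1}(U_\alpha) : \alpha \in I, U_\alpha \in \mathcal{T}_\alpha\}$.


Since $\bigcup \{S : S \in \mathcal{S}\} = X$, theorem 5.2.3 applies, and the
following definition is meaningful.


Definition. The product topology on $X$ is the weakest topology that contains
$\mathcal{S}$.


By construction, $\mathcal{S}$ is a subbase for the product topology (theorem
5.2.3). An open base $\mathcal{B}$ for the product topology on $X$ consists of
finite intersections of members of $\mathcal{S}$. Thus a typical member of
$\mathcal{B}$ is a set of the form

$$\bigcap_{i=1}^{n} \pi_{\alpha_i}^{-1}(U_i) = U_1 \times \cdots \times U_n \times \prod_{\alpha \ne \alpha_i} X_\alpha,$$

where $\{\alpha_1, \ldots, \alpha_n\}$ is a finite subset of $I$ and each $U_i$
is open in $X_{\alpha_i}$.


To reiterate, the above set is the set of all $x \in X$ such that
$\pi_{\alpha_i}(x) \in U_i$ for all $1 \le i \le n$.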
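
A basic open set restricts only finitely many coordinates, which makes membership a finite check even when the index set is infinite. The Python sketch below is an illustration only, under assumptions not in the original text: every factor is $\mathbb{R}$, the index set is the nonnegative integers, each $U_i$ is an open interval, and the names `in_basic_open_set`, `intersect`, and `harmonic_point` are hypothetical.

```python
def in_basic_open_set(point, constraints):
    """Test membership of a point in a basic open set.

    point: function from indices to real coordinates.
    constraints: dict mapping finitely many indices to open intervals (lo, hi).
    """
    return all(lo < point(a) < hi for a, (lo, hi) in constraints.items())


def intersect(c1, c2):
    """Constraints describing the intersection of two basic open sets."""
    out = dict(c1)
    for a, (lo, hi) in c2.items():
        if a in out:
            lo, hi = max(lo, out[a][0]), min(hi, out[a][1])
        out[a] = (lo, hi)
    return out


def harmonic_point(a):
    """The point x with coordinates x_a = 1 / (a + 1)."""
    return 1.0 / (a + 1)


b1 = {0: (0.5, 1.5), 3: (0.0, 0.5)}   # restricts coordinates 0 and 3 only
b2 = {0: (0.9, 2.0)}                  # restricts coordinate 0 only
b3 = {1: (0.6, 1.0)}                  # restricts coordinate 1 only

in_b1 = in_basic_open_set(harmonic_point, b1)
in_b1_and_b2 = in_basic_open_set(harmonic_point, intersect(b1, b2))
in_b3 = in_basic_open_set(harmonic_point, b3)
```

The helper `intersect` reflects the fact that the intersection of two basic open sets is again determined by finitely many coordinate restrictions, while all remaining coordinates stay unrestricted.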

The following theorem is a restatement of the definition of the product 
topology. See the proof of theorem 5.4.1. 


Theorem 5.12.1. The product topology is the weakest topology relative to which
all the projections $\pi_\alpha : X \to X_\alpha$ are continuous.


² The set $\pi_\alpha^{-1}(U_\alpha)$ is the set of all elements $x$ in $X$
such that $x_\alpha \in U_\alpha$ and the other coordinates, $x_\beta$, of $x$
are unrestricted elements of $X_\beta$. This is exactly the set
$U_\alpha \times \prod_{\beta \ne \alpha} X_\beta$.


Example 1. The product of an arbitrary collection $\{X_\alpha : \alpha \in I\}$
of Hausdorff spaces is Hausdorff.
Let $x = (x_\alpha)$ and $y = (y_\alpha)$ be distinct elements of
$\prod_{\alpha} X_\alpha$. Fix an element $\alpha \in I$ for which
$x_\alpha \ne y_\alpha$. Since $X_\alpha$ is Hausdorff, there exist open
neighborhoods $U_\alpha$ and $V_\alpha$ of $x_\alpha$ and $y_\alpha$,
respectively, such that $U_\alpha \cap V_\alpha = \emptyset$. Now
$\pi_\alpha^{-1}(U_\alpha)$ and $\pi_\alpha^{-1}(V_\alpha)$ are disjoint open
neighborhoods of $x$ and $y$, respectively. ◆


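Example 2. Let $\{X_\alpha : \alpha \in I\}$ be a collection of connected
spaces, let $X = \prod_{\alpha \in I} X_\alpha$, and let
$\varphi : X \to \{0,1\}$ be continuous. If $x, y \in X$ are such that
$x_\alpha = y_\alpha$ except for a finite subset $F$ of $I$, then
$\varphi(x) = \varphi(y)$.


Define a function $j : \prod_{\alpha \in F} X_\alpha \to X$ as follows:
$z \mapsto j_z$, where

$$j_z(\alpha) = \begin{cases} z_\alpha & \text{if } \alpha \in F, \\ y_\alpha & \text{if } \alpha \notin F. \end{cases}$$

For each $\beta \in I$, the function
$\pi_\beta \circ j : \prod_{\alpha \in F} X_\alpha \to X_\beta$ is equal to
$\pi_\beta$ if $\beta \in F$, and it is constant if $\beta \notin F$. By
problem 6 at the end of this section, the function $j$ is continuous. Now
$\prod_{\alpha \in F} X_\alpha$ is connected by theorem 5.5.5; hence
$j(\prod_{\alpha \in F} X_\alpha)$ is connected by theorem 5.5.3, and $\varphi$
is constant on it. Since $x$ and $y$ are in $j(\prod_{\alpha \in F} X_\alpha)$,
$\varphi(x) = \varphi(y)$. ◆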


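Example 3. If $\{X_\alpha : \alpha \in I\}$ is a collection of connected
spaces, then $X = \prod_{\alpha \in I} X_\alpha$ is connected.
Let $\varphi : X \to \{0,1\}$ be a continuous function, and suppose that
$U = \varphi^{-1}(0) \ne \emptyset$. We show that $\varphi(X) = \{0\}$. Fix an
element $a = (a_\alpha) \in U$, and let $x \in X$ be arbitrary. Since $U$ is
open in $X$, there is a basis element
$B = \bigcap_{i=1}^{n} \pi_{\alpha_i}^{-1}(U_{\alpha_i})$ such that
$a \in B \subseteq U$. Define an element $y \in X$ as follows:

$$y_\alpha = \begin{cases} a_\alpha & \text{if } \alpha = \alpha_i \text{ for some } i, \\ x_\alpha & \text{if } \alpha \ne \alpha_i \text{ for all } i. \end{cases}$$

Then $y \in B \subseteq U$ and $\varphi(y) = 0$. Since $x_\alpha = y_\alpha$
for all $\alpha \ne \alpha_i$, $\varphi(x) = \varphi(y) = 0$ by example 2. ◆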


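Theorem 5.12.2. If for each $\alpha$, $F_\alpha$ is closed in $X_\alpha$, then
$\prod_{\alpha} F_\alpha$ is closed in $X$.


Proof. Let $U_\alpha = X_\alpha - F_\alpha$. We claim that
$\prod_{\alpha} F_\alpha = X - \bigcup_{\alpha} \pi_\alpha^{-1}(U_\alpha)$. The
result follows since $\bigcup_{\alpha} \pi_\alpha^{-1}(U_\alpha)$ is open in
$X$. Let $x = (x_\alpha)$ be an element of $X$. Now
$x \in X - \prod_{\alpha} F_\alpha$ if and only if
$x \notin \prod_{\alpha} F_\alpha$, if and only if $x_\alpha \notin F_\alpha$
for some $\alpha \in I$, if and only if $x_\alpha \in U_\alpha$ for some
$\alpha \in I$, if and only if
$x \in \bigcup_{\alpha} \pi_\alpha^{-1}(U_\alpha)$. ■


Theorem 5.12.3. For each $\alpha \in I$, let $\mathcal{B}_\alpha$ be an open
base for $X_\alpha$. Then the family of subsets of $X$ of the form
$\bigcap_{\alpha \in F} \pi_\alpha^{-1}(B_\alpha)$, where $F$ ranges over
finite subsets of $I$ and $B_\alpha \in \mathcal{B}_\alpha$, is an open base
for the product topology.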

The proof is left as an exercise. 


Example 4. The product of a countable collection of second countable spaces is
second countable.
This follows directly from the previous theorem. Indeed, when $I$ is countable
and, for each $\alpha \in I$, $\mathcal{B}_\alpha$ is countable, then the
family $\bigcap_{\alpha \in F} \pi_\alpha^{-1}(B_\alpha)$, where $F$ ranges
over finite subsets of $I$ and $B_\alpha \in \mathcal{B}_\alpha$, is
countable. ◆


We need the following lemma before we tackle Tychonoff’s theorem. 


Lemma 5.12.4. Let $\{X_\alpha\}_\alpha$ be a collection of topological spaces,
and let $X$ be the product space. If $\mathfrak{F}$ is a collection of closed
subsets of $X$ possessing the finite intersection property, then there exists a
family $\mathfrak{F}^*$ of subsets of $X$, not necessarily closed, which is
maximal subject to the following conditions:

(a) $\mathfrak{F}^*$ has the finite intersection property, and

(b) $\mathfrak{F} \subseteq \mathfrak{F}^*$.

Furthermore, $\mathfrak{F}^*$ is closed under the formation of finite
intersections.


Proof. Consider the family $\mathfrak{D}$ of all collections of subsets of $X$
that contain $\mathfrak{F}$ and have the finite intersection property. Order
$\mathfrak{D}$ by set inclusion, and let $\mathfrak{C}$ be a chain in
$\mathfrak{D}$. We will verify that $\bigcup \{C : C \in \mathfrak{C}\}$ is an
upper bound on $\mathfrak{C}$. Let $F_1, \ldots, F_n$ be members of
$\bigcup \{C : C \in \mathfrak{C}\}$. Then there are members
$C_1, \ldots, C_n$ of $\mathfrak{C}$ such that $F_i \in C_i$. Since
$\mathfrak{C}$ is a chain, one of the families $C_1, \ldots, C_n$, say, $C_j$,
contains all the others. Now all the sets $F_1, \ldots, F_n$ are in $C_j$;
hence $\bigcap_{i=1}^{n} F_i \ne \emptyset$, and
$\bigcup \{C : C \in \mathfrak{C}\}$ has the finite intersection property.
Clearly, $\bigcup \{C : C \in \mathfrak{C}\}$ contains $\mathfrak{F}$. By
Zorn’s lemma, $\mathfrak{D}$ contains a maximal member $\mathfrak{F}^*$. If
there are sets $F_1$ and $F_2$ in $\mathfrak{F}^*$ such that
$F_1 \cap F_2 \notin \mathfrak{F}^*$, then
$\mathfrak{F}^* \cup \{F_1 \cap F_2\}$ would have properties (a) and (b), which
contradicts the maximality of $\mathfrak{F}^*$. This proves that the
intersection of two (hence any finite number of) sets in $\mathfrak{F}^*$ is in
$\mathfrak{F}^*$. ■


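Theorem 5.12.5 (Tychonoff’s theorem). Let
$\{(X_\alpha, \mathcal{T}_\alpha)\}_{\alpha \in I}$ be a collection of compact
topological spaces. Then the product space $X$ is compact.


Proof. Let $\mathfrak{F}$ be a collection of closed subsets of $X$ that has the
finite intersection property. We prove that
$\bigcap \{F : F \in \mathfrak{F}\} \ne \emptyset$; by theorem 5.8.8, it then
follows that $X$ is compact. Let $\mathfrak{F}^*$ be a collection of subsets of
$X$ having the properties described in lemma 5.12.4. We will show that
$\bigcap \{\overline{F} : F \in \mathfrak{F}^*\} \ne \emptyset$. This will
establish the theorem because the members of $\mathfrak{F}$ are closed and
$\bigcap \{\overline{F} : F \in \mathfrak{F}^*\} \subseteq \bigcap \{\overline{F} : F \in \mathfrak{F}\} = \bigcap \{F : F \in \mathfrak{F}\}$.


For a fixed $\alpha \in I$, consider the following collection of sets:

$$\{\overline{\pi_\alpha(F)} : F \in \mathfrak{F}^*\}.$$

This is a family of closed subsets of $X_\alpha$, and it has the finite
intersection property because if $F_1, \ldots, F_n$ are in $\mathfrak{F}^*$,
then
$\bigcap_{i=1}^{n} \overline{\pi_\alpha(F_i)} \supseteq \bigcap_{i=1}^{n} \pi_\alpha(F_i) \supseteq \pi_\alpha(\bigcap_{i=1}^{n} F_i) \ne \emptyset$.
Since each $X_\alpha$ is compact, there is an element
$x_\alpha \in \bigcap \{\overline{\pi_\alpha(F)} : F \in \mathfrak{F}^*\}$
(theorem 5.8.8). Let $x = (x_\alpha)$. We will show that
$x \in \bigcap \{\overline{F} : F \in \mathfrak{F}^*\}$. Let
$U = \bigcap_{i=1}^{n} \pi_{\alpha_i}^{-1}(U_{\alpha_i})$ be an arbitrary basic
open neighborhood of $x$. We claim that $U$ intersects every
$F \in \mathfrak{F}^*$. This will show that $x \in \overline{F}$, and the proof
will be complete. Since $x_{\alpha_i} \in U_{\alpha_i}$ and
$x_{\alpha_i} \in \overline{\pi_{\alpha_i}(F)}$ for every
$F \in \mathfrak{F}^*$, $U_{\alpha_i} \cap \pi_{\alpha_i}(F) \ne \emptyset$ for
every $F \in \mathfrak{F}^*$. Thus
$\pi_{\alpha_i}^{-1}(U_{\alpha_i}) \cap F \ne \emptyset$ for every
$F \in \mathfrak{F}^*$. By the maximality of $\mathfrak{F}^*$, it must be the
case that $\pi_{\alpha_i}^{-1}(U_{\alpha_i}) \in \mathfrak{F}^*$ (adjoining a
set that meets every member of $\mathfrak{F}^*$, which is closed under finite
intersections, preserves the finite intersection property). Since
$\mathfrak{F}^*$ is closed under the formation of finite intersections,
$U = \bigcap_{i=1}^{n} \pi_{\alpha_i}^{-1}(U_{\alpha_i}) \in \mathfrak{F}^*$.
In particular, $U \cap F \ne \emptyset$ for every $F \in \mathfrak{F}^*$. ■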


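Example 5 (the box topology). Let
$\{(X_\alpha, \mathcal{T}_\alpha)\}_{\alpha \in I}$ be a collection of
topological spaces, and let $X = \prod_{\alpha \in I} X_\alpha$ be the
Cartesian product of the underlying sets. Consider the topology $\mathcal{T}$
whose open subbase is

$$\mathcal{S} = \Big\{ \prod_{\alpha \in I} U_\alpha : U_\alpha \in \mathcal{T}_\alpha \Big\}.$$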


We alluded to this topology in the section preamble. It is a well-known topology, 
although it is more intellectually curious than practically important. When each 
of the spaces X, is compact and Hausdorff, the box topology is Hausdorff, 
because it contains the product topology, which is Hausdorff by example 1. 
However, the box topology is not compact, by problem 6 on section 5.8. In fact, 
the product topology has the optimality feature of being the smallest Hausdorff 
topology on X that admits the continuity of the projections, and the largest 
topology on X that admits Tychonoff’s theorem. 


We conclude this section by showing that not every compact Hausdorff space is 
metrizable. 


Example 6. Let $I$ be an uncountable set and, for each $\alpha \in I$, let
$X_\alpha = [0,1]$. The space $X = [0,1]^I = \prod_{\alpha \in I} X_\alpha$ is
compact by Tychonoff’s theorem, and Hausdorff by example 1. We show that $X$ is
not metrizable.
Let $0$ denote the zero function from $I$ to $[0,1]$, and let $A$ consist of
all elements $x = (x_\alpha) \in X$ such that $x_\alpha = 0$ for finitely many
$\alpha \in I$, and $x_\alpha = 1$ otherwise. We show that
$0 \in \overline{A}$. Suppose
$B = \bigcap_{i=1}^{n} \pi_{\alpha_i}^{-1}(U_{\alpha_i})$ is a basic open
neighborhood of $0$. The element $x = (x_\alpha)$ defined below is in
$A \cap B$, and hence $A \cap B \ne \emptyset$:

$$x_\alpha = \begin{cases} 0 & \text{if } \alpha = \alpha_i \text{ for some } i, \\ 1 & \text{if } \alpha \ne \alpha_i \text{ for all } i. \end{cases}$$

We show that, for any sequence $x^{(n)} = (x^{(n)}_\alpha)$ in $A$,
$\lim_n x^{(n)} \ne 0$. The proof will be complete by theorem 4.2.5. Let $I_n$
be the subset of elements $\alpha \in I$ for which $x^{(n)}_\alpha = 0$. The
set $J = \bigcup_{n=1}^{\infty} I_n$ is countable since each of the sets $I_n$
is finite. Because $I$ is uncountable, $I - J \ne \emptyset$. Pick an element
$\beta \in I - J$. By construction, $x^{(n)}_\beta = 1$ for all
$n \in \mathbb{N}$. Now consider the open set
$V = \pi_\beta^{-1}([0, 1/2))$; $V$ is a neighborhood of $0$ that contains no
terms of the sequence $(x^{(n)})$. ◆


Exercises 


1. Let $\{X_\alpha\}_\alpha$ be a collection of topological spaces, and let
$A_\alpha \subseteq X_\alpha$. Prove that
$\overline{\prod_{\alpha} A_\alpha} = \prod_{\alpha} \overline{A_\alpha}$.

2. Prove theorem 5.12.3.

3. Prove that the product of an arbitrary collection of regular spaces is
regular.

4. Prove that the product of a countable collection of separable spaces is
separable.

5. Let $\{X_\alpha\}_\alpha$ be a family of topological spaces, and let
$X = \prod_{\alpha} X_\alpha$. Prove that a sequence $(x^{(n)})$ in $X$
converges to $x \in X$ if and only if, for each $\alpha$,
$\pi_\alpha(x^{(n)})$ converges to $\pi_\alpha(x)$.

6. Let $\{X_\alpha\}_\alpha$ be a collection of topological spaces, and let $f$
be a function from a topological space $Y$ to the product space
$\prod_{\alpha} X_\alpha$. Prove that $f$ is continuous if and only if each of
the compositions $\pi_\alpha \circ f : Y \to X_\alpha$ is continuous.

7. Let $\{X_\alpha\}_{\alpha \in I}$ and $\{Y_\alpha\}_{\alpha \in I}$ be two
collections of topological spaces, and, for each $\alpha \in I$, let
$f_\alpha : X_\alpha \to Y_\alpha$ be continuous. Define a function
$F : \prod_{\alpha} X_\alpha \to \prod_{\alpha} Y_\alpha$ by
$F(x) = (f_\alpha(x_\alpha))_{\alpha \in I}$. Prove that $F$ is continuous.

8. Prove that the product of a countable collection of metrizable spaces is
metrizable. Hint: Let $\{(X_n, d_n)\}$ be a countable collection of metric
spaces. Without loss of generality, assume that each of the metrics $d_n$ is
bounded by 1 (see theorem 4.3.9). Prove that the metric

$$D(x,y) = \sup_{n \in \mathbb{N}} \frac{d_n(x_n, y_n)}{n}$$

induces the product topology on $\prod_{n=1}^{\infty} X_n$. Here $x = (x_n)$
and $y = (y_n)$ are elements of $\prod_{n=1}^{\infty} X_n$.

9. In the notation of the previous exercise, prove that the metric

$$\rho(x,y) = \sum_{n=1}^{\infty} 2^{-n} d_n(x_n, y_n)$$

also induces the product topology on $\prod_{n=1}^{\infty} X_n$.

10. In the notation of problem 8, prove that if $d_n$ is a complete metric for
every $n \in \mathbb{N}$, then $D$ is a complete metric.


6 


Banach Spaces 


Mathematics is the most beautiful and most powerful creation of the human 
spirit. 
Stefan Banach 




Stefan Banach. 1892-1945 


In 1902 Banach began his secondary education at the Henryk Sienkiewicz
Gymnasium in Kraków,¹ where he graduated in 1910. He then went to Lvov
where he studied engineering at Lvov Technical University, graduating in 1914, 
shortly before World War I broke out in August. With the outbreak of the war, 
the Russian troops occupied the city of Lvov. Having poor vision in his left eye, 
Banach was not physically fit for army service. During the war, he worked building 
roads but also spent time in Krakow, where he earned money by teaching in the 
local schools. He also attended mathematics lectures at the Jagiellonian University 
in Krakow. 


A life-changing event occurred in the spring of 1916 when Banach met Steinhaus, 
who was living in Krakow, waiting to take up a post at the Jan Kazimierz University 
in Lvov. Steinhaus and Banach wrote a joint paper, which was published in The 
Bulletin of the Krakow Academy after the war ended in 1918. From that time, 


¹ A European secondary school that prepares students for the university.


Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules. 
DOI: 10.1093/oso/9780198868781.003.0006




Banach started to produce important mathematics papers at a rapid rate. On 
Steinhaus’s initiative, the Mathematical Society of Krakow was set up in 1919. The 
society later became the Polish Mathematical Society in 1920. It was also through 
Steinhaus that Banach met his future wife, Lucja Braus, whom he married in 1920. 


Banach was offered an assistantship to Lomnicki at Lvov Technical University in 
1920. He lectured there in mathematics and submitted a dissertation. This was, 
of course, not the standard route to a doctorate, for Banach had no university 
mathematics qualifications. However, an exception was made to allow him to 
submit his thesis “On Operations on Abstract Sets and their Application to Integral 
Equations.” This thesis is sometimes said to mark the birth of functional analysis. 
In his dissertation, Banach defined axiomatically what today is called a Banach 
space, a term which was coined later by Fréchet. The importance of Banach’s work 
is that he developed a systematic theory of functional analysis, where before there 
had only been isolated results, which were later seen to fit into the new theory. 


In 1922 the Jan Kazimierz University in Lvov awarded Banach his qualification to 
become a university professor, and in 1924 Banach was promoted to full professor. 
The years between the wars were extremely busy for Banach. As well as continuing 
to produce a stream of important papers, he wrote arithmetic, geometry, and 
algebra texts for high schools. In 1929, together with Steinhaus, he started a 
new journal, Studia Mathematica, and Banach and Steinhaus became the first 
editors. Another important publishing venture, begun in 1931, was a new series 
of mathematical monographs. These were set up under the editorship of Banach 
and Steinhaus, from Lvov, and Knaster, Kuratowski, Mazurkiewicz, and Sierpinski 
from Warsaw. The first volume in the series, Théorie des opérations linéaires, was 
written by Banach and appeared in 1932. It was a French version of a volume 
he originally published in Polish in 1931 and quickly became a classic. Another 
important influence on Banach was the fact that Kuratowski was appointed to 
the Lvov Technical University in 1927 and worked there until 1934. Banach 
collaborated with Kuratowski, and they wrote some joint papers during this 
period. Banach proved a number of fundamental results on normed linear spaces, 
including the Hahn-Banach theorem, the Banach-Steinhaus theorem, the Banach- 
Alaoglu theorem, Banach’s open mapping theorem, and the Banach fixed point 
theorem. In addition, he contributed to measure theory, integration, topological 
vector spaces, and set theory. 


In 1939, just before the start of World War II, Banach was elected as President 
of the Polish Mathematical Society. At the beginning of the war, Soviet troops 
occupied Lvov. Banach had been on good terms with the Soviet mathematicians 
before the war started, visiting Moscow several times, and he was treated well by 
the new Soviet administration. He was allowed to continue to hold his chair at 
the university, and he became Dean of the Faculty of Science at the university,
now renamed the Ivan Franko University. Life at this stage was little changed for 
Banach, who continued his research, his textbook writing, lecturing, and holding 
sessions in cafés. Sobolev and Alexandroff visited Banach in Lvov in 1940, and 
Banach attended conferences in the Soviet Union. He was in Kiev when Germany 
invaded the Soviet Union, and he returned immediately to his family in Lvov. 


The Nazi occupation of Lvov in June 1941 meant that Banach lived under very 
difficult conditions. He was arrested under suspicion of trafficking in German 
currency but was released after a few weeks. As soon as the Soviet troops retook 
Lvov, Banach renewed his contacts. He met Sobolev outside Moscow, but by this 
time he was seriously ill. Sobolev, giving an address at a memorial conference for 
Banach, said of this meeting:²


Despite heavy traces of the war years under German occupation, and despite the 
grave illness that was undercutting his strength, Banach’s eyes were still lively. 
He remained the same sociable, cheerful, and extraordinarily well-meaning and 
charming Stefan Banach whom I had seen in Lvov before the war. That is how he 
remains in my memory: with a great sense of humor, an energetic human being, a 
beautiful soul, and a great talent. 


Banach had planned to go to Krakow after the war to take up the chair of 
mathematics at the Jagiellonian University, but he died in Lvov in 1945 of lung 
cancer. 


6.1 Finite vs. Infinite-Dimensional Spaces 


This section draws some sharp distinctions between finite and infinite- 
dimensional spaces. Although some of the results in this section have intrinsic 
importance and will be used later in the book, they are collected here to convince 
the reader that infinite-dimensional spaces are truly vast compared to finite- 
dimensional ones and that a very different set of tools is needed for studying 
them. Among other results, we will see that local compactness characterizes finite- 
dimensional normed linear spaces, and that an infinite-dimensional Banach space 
cannot have a countable linear basis. 


Definition. A Banach space is a complete normed linear space.
Examples of Banach spaces include $(\mathbb{K}^n, \|\cdot\|)$,
$(C[0,1], \|\cdot\|_\infty)$, and all $\ell^p$ spaces.


² J. J. O'Connor and E. F. Robertson, “Stefan Banach,” in MacTutor History of
Mathematics (St Andrews: University of St Andrews, 1998),
http://mathshistory.st-andrews.ac.uk/Biographies/Banac/, accessed Nov. 1, 2020.
Lemma 6.1.1. Let $X$ be an $n$-dimensional vector space. Then there exists a
norm $\|\cdot\|^*$ on $X$ such that $(X, \|\cdot\|^*)$ is isometric to
$(\mathbb{K}^n, \|\cdot\|_\infty)$. In particular, $(X, \|\cdot\|^*)$ is
complete and locally compact.


Proof. Fix a basis $\{x_1, \ldots, x_n\}$ of $X$, and define
$\|x\|^* = \max_{1 \le i \le n} |a_i|$, where $x = \sum_{i=1}^{n} a_i x_i$ is
the unique representation of $x$ as a linear combination of the basis elements.
The mapping $T : x \mapsto (a_1, \ldots, a_n)$ is clearly a linear isometry
from $(X, \|\cdot\|^*)$ onto $(\mathbb{K}^n, \|\cdot\|_\infty)$. ■


Theorem 6.1.2. Let $(X, \|\cdot\|)$ be an $n$-dimensional normed linear space,
and let $\|\cdot\|^*$ be the norm on $X$ defined in lemma 6.1.1. Then there
exist positive constants $\alpha$ and $\beta$ such that, for all $x \in X$,
$\beta \|x\|^* \le \|x\| \le \alpha \|x\|^*$.


Proof. We continue to use the notation of the proof of the previous lemma.
Let $\alpha = n \max_{1 \le i \le n} \|x_i\|$. Then

$$\|x\| = \Big\| \sum_{i=1}^{n} a_i x_i \Big\| \le \sum_{i=1}^{n} |a_i| \|x_i\| \le \max_{1 \le i \le n} \|x_i\| \sum_{i=1}^{n} |a_i| \le n \max_{1 \le i \le n} \|x_i\| \max_{1 \le i \le n} |a_i| = \alpha \|x\|^*.$$

To prove the other inequality, define a function
$\lambda : (X, \|\cdot\|^*) \to \mathbb{R}$ by $\lambda(x) = \|x\|$. Now
$\lambda$ is continuous because if $\lim_n \|x_n - x\|^* = 0$, then
$|\lambda(x_n) - \lambda(x)| = |\|x_n\| - \|x\|| \le \|x_n - x\| \le \alpha \|x_n - x\|^*$.
Hence $\lambda(x_n) \to \lambda(x)$. By lemma 6.1.1, $(X, \|\cdot\|^*)$ is
locally compact; hence the closed unit sphere
$S = \{x \in X : \|x\|^* = 1\}$ in $(X, \|\cdot\|^*)$ is compact (see problem 9
in section 4.7). Thus the restriction of $\lambda$ to $S$ assumes a minimum
value $\beta = \lambda(x_0)$ at some point $x_0 \in S$. The constant $\beta$
must be positive since, otherwise, $\lambda(x_0) = \|x_0\| = 0$, and hence
$x_0 = 0$, which is not possible. We have shown that, for every $x \in S$,
$\|x\| \ge \beta$. Now, for a nonzero vector $x \in X$,
$x / \|x\|^* \in S$; hence $\| x / \|x\|^* \| \ge \beta$, and
$\|x\| \ge \beta \|x\|^*$. ■

Corollary 6.1.3. All norms on a finite-dimensional normed linear space are
equivalent.


Proof. By theorem 6.1.2, an arbitrary norm on a finite-dimensional space is
equivalent to $\|\cdot\|^*$; hence any two norms are equivalent. ■


Theorem 6.1.4. A finite-dimensional proper subspace $F$ of a normed linear
space $X$ is closed and nowhere dense.
Proof. By lemma 6.1.1, $F$ is complete and hence closed in $X$. To show that
$F$ is nowhere dense, let $\{x_1, \ldots, x_n\}$ be a basis for $F$, and let
$x \in X - F$. If $F$ contains a ball of radius $\delta$, then, because $F$ is
a subspace, it would contain the ball $B$ of radius $\delta$ centered at the
origin. But then $B$ (hence $F$) would contain a multiple of $x$, namely,
$\delta x / (2\|x\|)$. This contradiction shows that $F$ is nowhere dense in
$X$. ■


Example 1. Let $F$ be a finite-dimensional subspace of a normed linear space
$X$, and let $x \in X - F$. Then there exists a point $z \in F$ such that
$\|x - z\| = \operatorname{dist}(x, F)$.
Let $B = B[x,r]$ be a closed ball centered at $x$, and assume $r$ is large
enough so that $B \cap F \ne \emptyset$. The set $K = B \cap F$ is a closed and
bounded subset of $F$. By the Heine-Borel theorem (see problem 6 at the end of
this section), $K$ is compact. The function $f : K \to \mathbb{R}$ given by
$f(y) = \|x - y\|$ is continuous and positive. Therefore
$d = \min\{f(y) : y \in K\}$ is positive and is attained at some point
$z \in K$. Since $d \le r$ and $\|x - y\| > r$ for every vector
$y \in F - B$, $\|x - z\| = \operatorname{dist}(x, F)$. ◆


Example 2. The following is a direct application of the previous example. Take
$X = C[0,1]$ and $F = P_n$. For any function $f \in X$, there is a polynomial $p_n^* \in P_n$
such that $\|f - p_n^*\|_\infty = \mathrm{dist}(f, P_n)$.


The polynomial $p_n^*$ is the best approximation of f in $P_n$. It can be shown that $p_n^*$ is
unique. Observe that $p_n^*$ can have degree less than n.


Example 3. For a function $f \in C[0,1]$, the sequence of best approximations $p_n^*$
converges uniformly to f.
Let $\epsilon > 0$. By the Weierstrass polynomial approximation theorem, there exists
a polynomial q such that $\|f - q\|_\infty < \epsilon$. Let N be the degree of q. Then, for
every $n \ge N$, $q \in P_n$. Since $p_n^*$ is the best approximation of f in $P_n$,
$\|f - p_n^*\|_\infty \le \|f - q\|_\infty < \epsilon$. This shows that $\lim_n \|f - p_n^*\|_\infty = 0$. ♦


The following theorem establishes the fact that local compactness is exclusively a 
property of finite-dimensional spaces. 


Theorem 6.1.5. A normed linear space X is locally compact if and only if it is finite 
dimensional. 


Proof. Finite-dimensional spaces are locally compact by lemma 6.1.1 and corollary
6.1.3. Now suppose X is a locally compact normed linear space. Thus the
closed unit ball $B = \{x \in X : \|x\| \le 1\}$ is compact. Since $B \subseteq \bigcup_{x \in B} B(x, 1/2)$,
$B \subseteq \bigcup_{i=1}^{n} B(x_i, 1/2)$ for a finite subset $\{x_1, \ldots, x_n\}$ of B. We will show that
$\{x_1, \ldots, x_n\}$ spans X. Let $F = \mathrm{Span}\{x_1, \ldots, x_n\}$, and suppose, for a contradiction,
that there is a vector $x \in X - F$. By example 1, there is an element $z \in F$ which is closest
to x, and $d = \mathrm{dist}(x, F) = \|x - z\| > 0$. Now $\frac{x - z}{\|x - z\|} \in B$, so there is an element
$x_i \in B$ such that

$$\Big\|\frac{x - z}{\|x - z\|} - x_i\Big\| < 1/2, \quad \text{and} \quad \big\|x - z - \|x - z\|x_i\big\| < \|x - z\|/2 = d/2.$$

But $z + \|x - z\|x_i \in F$, so $\big\|x - z - \|x - z\|x_i\big\| \ge d$. This contradiction concludes
the proof. ■


Remark. The last theorem implies that compact subsets of an infinite-dimensional
space have empty interiors and are therefore thought of as rather thin and scarce 
sets. However, compact sets continue to play an important role in the study of 
infinite-dimensional spaces. The Hilbert cube is an example of a rather exotic 
compact set. 


Example 4. The Hilbert cube $\mathcal{H}$ is a compact, convex, nowhere dense subset of
$\ell^2$. Furthermore, $\mathrm{Span}(\mathcal{H})$ is dense in $\ell^2$.


It is simple to show that $\mathcal{H}$ is closed and convex. Once we show that $\mathcal{H}$ is
compact, the above remark implies that it has an empty interior.

Let $(x^k)_{k=1}^{\infty}$ be a sequence in $\mathcal{H}$, and write $x^k = (x^k_n)_{n=1}^{\infty}$. The sequence
$(x^k_1)_{k=1}^{\infty}$ is bounded by 1, so there exists a strictly increasing sequence $(k^1_p)_{p=1}^{\infty}$
of positive integers such that $x_1 = \lim_{p \to \infty} x_1^{k^1_p}$ exists. Since the sequence $(x_2^{k^1_p})$
is bounded by 1/2, it contains a convergent subsequence $(x_2^{k^2_p})_{p=1}^{\infty}$. Let
$x_2 = \lim_p x_2^{k^2_p}$. Continue inductively to find sequences $(k^n_p)_{p=1}^{\infty}$ of positive integers
such that, for each $n \ge 1$, $(k^{n+1}_p)$ is a subsequence of $(k^n_p)$ such that
$x_n = \lim_p x_n^{k^n_p}$ exists. Consider the diagonal sequence $(k^p_p)$, and observe that,
except possibly for its first $n - 1$ terms, it is a subsequence of $(k^n_p)_{p=1}^{\infty}$ for every
$n \ge 1$; thus $\lim_p x_n^{k^p_p} = x_n$. Let $x = (x_n)$, and observe that
$|x_n| = \lim_p |x_n^{k^p_p}| \le 1/n$; thus $x \in \mathcal{H}$. We claim that $\lim_p x^{k^p_p} = x$ in
$\ell^2$. For simplicity of notation, write $y^p = x^{k^p_p}$. Let $\epsilon > 0$, and choose an integer N
such that $\sum_{n=N+1}^{\infty} \frac{1}{n^2} < \epsilon^2/8$. For $n = 1, 2, \ldots, N$, $\lim_p y^p_n = x_n$; thus there exists an
integer P such that, for $n = 1, \ldots, N$, $p > P$ implies that $|y^p_n - x_n|^2 < \frac{\epsilon^2}{2N}$. Hence,
for $p > P$,

$$\sum_{n=1}^{N} |y^p_n - x_n|^2 < \epsilon^2/2.$$

Also,

$$\sum_{n=N+1}^{\infty} |y^p_n - x_n|^2 \le \sum_{n=N+1}^{\infty} \Big(\frac{2}{n}\Big)^2 = 4 \sum_{n=N+1}^{\infty} \frac{1}{n^2} < \epsilon^2/2.$$

These last two inequalities imply that, for $p > P$, $\|y^p - x\|_2 < \epsilon$.

Finally, we show that $\mathrm{Span}(\mathcal{H})$ is dense in $\ell^2$. Let $x = (x_n) \in \ell^2$, and let $\epsilon > 0$.
Choose an integer N such that $\sum_{n=N+1}^{\infty} |x_n|^2 < \epsilon^2$. Define $y_1 = (1, 0, 0, \ldots)$,
$y_2 = (0, 1/2, 0, 0, \ldots)$, ..., $y_N = (0, 0, \ldots, 0, 1/N, 0, \ldots)$, and set $h = \sum_{n=1}^{N} a_n y_n$,
where $a_n = n x_n$. Clearly, $h \in \mathrm{Span}(\mathcal{H})$ and $\|x - h\|_2 < \epsilon$. ♦
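
The density argument at the end of example 4 can be traced numerically. The sketch below is an illustration only, under explicit assumptions: the infinite sequence is truncated to `LENGTH` coordinates, and the sample vector $x_n = n^{-3/4}$ (square summable, but not in the Hilbert cube) is a choice made here, not one taken from the text.

```python
import math

def hilbert_cube_approximation(x, N):
    """Return h = sum_{n<=N} a_n * y_n, where y_n = e_n / n and a_n = n * x_n."""
    h = [0.0] * len(x)
    for n in range(1, N + 1):
        a_n = n * x[n - 1]            # coefficient a_n = n x_n
        y_n_entry = 1.0 / n           # the only nonzero entry of y_n; y_n is in the cube
        h[n - 1] = a_n * y_n_entry    # equals x_n up to rounding
    return h

def l2_distance(u, v):
    """Euclidean distance between two finite lists of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

LENGTH = 20000                        # truncation length (an assumption)
x = [n ** -0.75 for n in range(1, LENGTH + 1)]

# The error is the norm of the tail of x beyond index N, so it shrinks as N grows.
errors = [l2_distance(x, hilbert_cube_approximation(x, N))
          for N in (10, 100, 1000, 10000)]
print(errors)
```

Since h agrees with x in the first N coordinates and vanishes afterwards, the printed errors are just the tail norms of x, which is exactly why the choice of N in the example controls $\|x - h\|_2$.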


Two very useful tools for studying finite-dimensional spaces are local compact- 
ness and the existence of a finite linear basis. We already saw in theorem 6.1.5 
that infinite-dimensional normed linear spaces are never locally compact. By 
definition, an infinite-dimensional space cannot have a finite Hamel basis. The 
following theorem should thoroughly convince the reader that a Hamel basis is 
of no practical use as a tool for studying infinite-dimensional Banach spaces. 
However, see the concept of a Schauder basis in the exercises following this section 
and the next section. 


Theorem 6.1.6. An infinite-dimensional Banach space does not have a countable 
Hamel basis. 


Proof. Let X be an infinite-dimensional Banach space, and let $B = \{x_1, x_2, \ldots\}$ be
a countably infinite independent subset of X. The finite-dimensional spaces
$F_n = \mathrm{Span}\{x_1, \ldots, x_n\}$ are closed and nowhere dense in X. Baire’s theorem implies
that $\bigcup_{n=1}^{\infty} F_n \ne X$. Since $\mathrm{Span}(B) = \bigcup_{n=1}^{\infty} F_n$, $\mathrm{Span}(B) \ne X$. Therefore no countable
subset of X spans X. ■


The following result will be used frequently later in the book. The motivation for 
the theorem is provided below. 

Let M be a proper subspace of $\mathbb{R}^n$. It is an elementary fact of linear algebra (see
problem 7 on section 3.7) that there is a unit vector x orthogonal to M. In this case,
dist(x, M) = 1.

Generalizing this result to Banach spaces is more challenging because we lack
the concept of orthogonality, which is available only for inner product spaces.
The result below provides the next best alternative to the desirable property that 
dist(x, M) = 1; we can pick a unit vector x whose distance from M is arbitrarily 
close to 1. 


Lemma 6.1.7 (Riesz’s lemma). Let M be a closed proper subspace of a normed
linear space X, and let $0 < \theta < 1$. Then there exists a unit vector x such that
$\mathrm{dist}(x, M) \ge \theta$.


Proof. Let $v \in X - M$, and let $\delta = \mathrm{dist}(v, M)$. Since M is closed, $\delta > 0$. Since $\theta < 1$,
there exists $y_0 \in M$ such that $\delta \le \|v - y_0\| < \delta/\theta$. Define $x = \frac{v - y_0}{\|v - y_0\|}$. For $y \in M$,
$y_0 + \|v - y_0\|y \in M$ and $\big\|v - (y_0 + \|v - y_0\|y)\big\| \ge \delta$. Now

$$\|x - y\| = \Big\|\frac{v - y_0}{\|v - y_0\|} - y\Big\| = \frac{1}{\|v - y_0\|}\big\|v - (y_0 + \|v - y_0\|y)\big\| \ge \frac{\delta}{\|v - y_0\|} > \frac{\delta}{\delta/\theta} = \theta. \ ■$$
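
The construction in the proof of Riesz’s lemma can be followed step by step in a toy setting. The sketch below is an illustration under stated assumptions: X is the Euclidean plane, M is the horizontal axis, and the values of `theta`, `v`, and `y0` are chosen here for convenience and do not come from the text. In the Euclidean plane a unit vector at distance exactly 1 from M exists, so the example only illustrates the mechanics of the proof.

```python
import math

def riesz_unit_vector(v, y0):
    """Return x = (v - y0) / ||v - y0|| for vectors in the Euclidean plane."""
    diff = (v[0] - y0[0], v[1] - y0[1])
    length = math.hypot(diff[0], diff[1])
    return (diff[0] / length, diff[1] / length)

theta = 0.9
v = (3.0, 2.0)                 # a vector outside M = {(t, 0) : t real}
delta = abs(v[1])              # dist(v, M) for the horizontal axis
y0 = (2.5, 0.0)                # a point of M with delta <= ||v - y0|| < delta / theta
gap = math.hypot(v[0] - y0[0], v[1] - y0[1])

x = riesz_unit_vector(v, y0)
dist_x_M = abs(x[1])           # distance from x to the horizontal axis

print(gap, delta / theta, dist_x_M)
```

Here `gap` lies between `delta` and `delta / theta`, as the proof requires, and the resulting unit vector has distance greater than `theta` from M.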


Exercises 


1. (a) Prove that a sequence $(x_n)$ in a normed linear space X is Cauchy if and
only if $\lim_n (x_{p_n} - x_{q_n}) = 0$ for every pair $(p_n)$ and $(q_n)$ of increasing
sequences of positive integers.
(b) Show that if $\lim_n x_n = x$, then $\lim_n \frac{1}{n}\sum_{i=1}^{n} x_i = x$.

2. Let w be a fixed positive function in $C[0,1]$. For $f \in C[0,1]$, define
$\|f\|_w = \|fw\|_\infty$. Prove that $\|.\|_w$ is a norm, and determine if it is equivalent
to the uniform norm on $C[0,1]$.

3. Let X be a normed linear space. Prove that X is a Banach space if and only
if the closed unit ball in X is complete.

4. Let X be a normed linear space. Prove that X is separable if and only if the
closed unit sphere in X is separable.

5. Let $(x_n)$ be a sequence in a Banach space X such that, for every $\epsilon > 0$, there
exists a convergent sequence $(y_n)$ in X such that $\|x_n - y_n\| < \epsilon$ for all $n \in \mathbb{N}$.
Prove that $(x_n)$ is convergent.

6. The Heine-Borel theorem. Let V be a finite-dimensional subspace of a
normed linear space X. Show that a subset K of V is compact if and only
if it is closed and bounded.

7. Let X be an infinite-dimensional normed linear space. Show that X contains
a compact countable subset that is not contained in any finite-dimensional
subspace of X. Hint: Let $\{x_1, x_2, \ldots\}$ be an infinite independent subset of X,
and let $\xi_n = \frac{x_n}{n\|x_n\|}$. Consider the set $\{\xi_n\} \cup \{0\}$.

8. Prove that, for $1 \le p < \infty$, the linear dimension of $\ell^p$ is $\mathfrak{c}$. Hint: For each
$0 < \lambda < 1$, let $x_\lambda = (\lambda, \lambda^2, \lambda^3, \ldots)$. Show that the set $\{x_\lambda\}$ is independent, then
use example 7 on section 4.5.


Definition. Let $(x_n)$ be a sequence in a normed linear space X. We say the
series $\sum_{n=1}^{\infty} x_n$ converges if the sequence of partial sums $S_n = \sum_{i=1}^{n} x_i$
converges to an element $x \in X$. We write $x = \sum_{n=1}^{\infty} x_n$ for $\lim_n S_n$ and say
that x is the sum of the series. We say that $\sum_{n=1}^{\infty} x_n$ is absolutely convergent
if $\sum_{n=1}^{\infty} \|x_n\| < \infty$.


9. Prove that if X is a Banach space, then every absolutely convergent series is
convergent.

10. Prove that if every absolutely convergent series in a normed linear space
X is convergent, then X is a Banach space. The proof outline is as follows.
Let $(x_n)$ be a Cauchy sequence in X. It is enough to show that $(x_n)$ contains
a convergent subsequence. Choose a subsequence $(x_{n_k})$ of $(x_n)$ such that
$\|x_{n_{k+1}} - x_{n_k}\| < 2^{-k}$. Define $y_k = x_{n_{k+1}} - x_{n_k}$. Show that $\sum_{k=1}^{\infty} \|y_k\| < \infty$. By
assumption, $\sum_{k=1}^{\infty} y_k$ converges. But $\sum_{k=1}^{\infty} y_k = -x_{n_1} + \lim_k x_{n_k}$.


Definition. Let X be a normed linear space. A countable subset $\{u_n\}$ of X
is a Schauder basis for X if every element $x \in X$ can be expressed uniquely
as $x = \sum_{n=1}^{\infty} a_n u_n$, where $a_n \in \mathbb{K}$.


11. Prove that a Schauder basis is independent.

12. Prove that if X has a Schauder basis then it is separable.

13. Find a Schauder basis for $\ell^p$, $1 \le p < \infty$.

14. Prove that if M is a subspace of a normed linear space X, then the closure $\overline{M}$ is
also a subspace of X.

15. Let M be a closed subspace of a normed linear space X, and let $x \in X$. Prove
that $\mathrm{dist}(x, M) \le \|x\|$.

16. Use Riesz’s lemma to produce another proof that an infinite-dimensional
normed linear space is not locally compact.


6.2 Bounded Linear Mappings 


The boundedness of a linear transformation on a normed linear space and its continuity
are used synonymously. Every linear transformation on a finite-dimensional
space is continuous. The picture is far more complicated for linear transformations 
on infinite-dimensional spaces. In this chapter and the next, we study continuous 
linear transformations exclusively because nonlinear transformations and discon- 
tinuous linear transformations fall outside the realm of beginning linear functional 
analysis. 

In this section, we study the various equivalent characterizations of bounded- 
ness, the space of bounded linear transformations on a normed linear space, and 
the dual space in particular. The section concludes with a typical representation 
theorem, which gives a concrete description of the dual of a normed linear space. 
Throughout this section, X and Y are normed linear spaces. 


Definition. A linear mapping $T : X \to Y$ is said to be bounded if there exists a
constant $M > 0$ such that for every $x \in X$,

$$\|T(x)\| \le M\|x\|.$$


Theorem 6.2.1. Let $T : X \to Y$ be linear. The following are equivalent:


(a) T is continuous.

(b) T is continuous at one point $x_0 \in X$.

(c) T is continuous at 0.

(d) T is bounded.


Proof. (a) implies (b), obviously.
(b) implies (c). Let $x_n \to 0$ in X. Then $x_n + x_0$ converges to $x_0$. By assumption,
$\lim_n T(x_n + x_0) = T(x_0)$. But $T(x_n + x_0) = T(x_n) + T(x_0)$; hence
$\lim_n T(x_n) = 0$.
(c) implies (d). Suppose T is not bounded. Then, for every $n \in \mathbb{N}$, there exists
$x_n \in X$ such that $\|T(x_n)\| > n\|x_n\|$. Let $\xi_n = \frac{x_n}{n\|x_n\|}$. Then $\lim_n \xi_n = 0$ in X, but
$\|T(\xi_n)\| = \frac{\|T(x_n)\|}{n\|x_n\|} > 1$. Thus, $\lim_n T(\xi_n) \ne 0$ in Y, and T is not continuous at 0.
(d) implies (a). Suppose that there is a constant $M > 0$ such that, for every $x \in X$,
$\|T(x)\| \le M\|x\|$, and $\lim_n x_n = x$ in X. Then
$\lim_n \|T(x_n) - T(x)\| = \lim_n \|T(x_n - x)\| \le \lim_n M\|x_n - x\| = 0$. ■

Let $T : X \to Y$ be a bounded linear mapping. The norm of T is

$$\|T\| = \sup_{x \ne 0} \frac{\|T(x)\|}{\|x\|}.$$

Notice that since T is bounded, there is a constant $M > 0$ such that $\|T(x)\| \le M\|x\|$
for all $x \in X$; therefore $\frac{\|T(x)\|}{\|x\|} \le M$, and hence $\|T\|$ is finite. It also follows directly
from the definition that $\|T(x)\| \le \|T\|\|x\|$.


Example 1. Let $(c_n)$ be a bounded sequence and, for a sequence $x = (x_1, x_2, \ldots) \in \ell^2$,
define $T(x) = (c_1x_1, c_2x_2, \ldots)$. We claim that T is a bounded linear mapping
on $\ell^2$. Indeed, $\|T(x)\|_2^2 = \sum_{n=1}^{\infty} |c_nx_n|^2 \le \|c\|_\infty^2 \sum_{n=1}^{\infty} |x_n|^2 = \|c\|_\infty^2\|x\|_2^2$. This
estimate shows that $T(x) \in \ell^2$ and that $\|T\| \le \|c\|_\infty$. The linearity of T is
obvious. ♦
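
The estimate $\|T(x)\|_2 \le \|c\|_\infty\|x\|_2$ for the diagonal mapping of example 1 can be observed on a finite truncation. The following sketch rests on assumptions made only for illustration: the multiplier sequence $c_n = \sin n$, the truncation to 50 coordinates, and the random test vectors are not taken from the text, and a finite experiment says nothing about whether the bound is attained in the infinite-dimensional setting.

```python
import math
import random

def diagonal_operator(c, x):
    """Apply T(x) = (c_1 x_1, c_2 x_2, ...) to a finite list x."""
    return [ci * xi for ci, xi in zip(c, x)]

def norm2(x):
    """Euclidean norm of a finite list."""
    return math.sqrt(sum(abs(t) ** 2 for t in x))

random.seed(0)                                   # reproducible sampling
c = [math.sin(n) for n in range(1, 51)]          # illustrative bounded multiplier
sup_c = max(abs(t) for t in c)                   # sup norm of the truncated sequence

# Ratios ||T(x)|| / ||x|| for random nonzero x never exceed sup_c.
ratios = []
for _ in range(200):
    x = [random.uniform(-1.0, 1.0) for _ in c]
    ratios.append(norm2(diagonal_operator(c, x)) / norm2(x))

# In the truncated setting, the coordinate vector at the largest |c_n| attains sup_c.
k = max(range(len(c)), key=lambda i: abs(c[i]))
e_k = [1.0 if i == k else 0.0 for i in range(len(c))]
attained = norm2(diagonal_operator(c, e_k))

print(max(ratios), sup_c, attained)
```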


Example 2. A bounded linear mapping $T : X \to Y$ maps bounded sets into
bounded sets.
Let A be a bounded subset of X and suppose that $\|x\| \le r$ for every $x \in A$.
Then $\|T(x)\| \le \|T\|\|x\| \le \|T\|r < \infty$. ♦


Example 3. If X is finite dimensional, then every linear mapping $T : X \to Y$ is
bounded.

Let $\{x_1, \ldots, x_n\}$ be a linear basis for X, and let $M = \max_{1 \le i \le n}\|T(x_i)\|$. Since all
norms on X are equivalent, we may assume that the norm on X is the 1-norm.
Thus if $x = \sum_{i=1}^{n} a_ix_i$, then $\|x\| = \sum_{i=1}^{n} |a_i|$. Now
$\|T(x)\| = \|\sum_{i=1}^{n} a_iT(x_i)\| \le \sum_{i=1}^{n} |a_i|\|T(x_i)\| \le M\sum_{i=1}^{n} |a_i| = M\|x\|$. ♦


Example 4. If X is infinite dimensional, then there exists a linear unbounded 
mapping from X to Y. 


Fix a nonzero element $y \in Y$, and let $S_1 = \{x_1, x_2, \ldots\}$ be an infinite independent
set of unit vectors in X. Let $S_2 \subseteq X$ be such that $S = S_1 \cup S_2$ is a linear basis for
X. Define a function $T : S \to Y$ as follows:

$$T(x) = \begin{cases} ny & \text{if } x = x_n, \\ 0 & \text{if } x \in S_2. \end{cases}$$

Extend T to a linear mapping $T : X \to Y$. See theorem 3.4.4. Since $S_1$ is bounded
but $T(S_1)$ is not, T is not bounded by example 2. ♦


Theorem 6.2.2. Let $T : X \to Y$ be a bounded linear mapping. Then

$$\|T\| = \sup_{\|x\| \le 1}\|T(x)\| = \sup_{\|x\| = 1}\|T(x)\|.$$


Proof. Let $M = \sup_{\|x\| \le 1}\|T(x)\|$. For every $x \in X$, $x \ne 0$, $\|T(\frac{x}{\|x\|})\| \le M$, hence
$\frac{\|T(x)\|}{\|x\|} \le M$. Thus $\|T\| \le M$. To prove that $M \le \|T\|$, fix a vector $x \in X$ such
that $0 < \|x\| \le 1$. By definition of $\|T\|$, $\|T\| \ge \frac{\|Tx\|}{\|x\|} \ge \|Tx\|$. Since x is arbitrary,
it follows that $M = \sup_{\|x\| \le 1}\|Tx\| \le \|T\|$, as desired.
The proof that $\|T\| = \sup_{\|x\| = 1}\|T(x)\|$ is similar. ■


Theorem 6.2.3. Let X and Y be normed linear spaces, and let $\mathcal{L}(X, Y)$ be the set
of all bounded linear mappings from X to Y. Then $\mathcal{L}(X, Y)$ is a normed linear
space with the operations $(T_1 + T_2)(x) = T_1(x) + T_2(x)$, $(aT)(x) = aT(x)$, and
the norm $\|T\| = \sup_{x \ne 0} \frac{\|T(x)\|}{\|x\|}$. Furthermore, if Y is a Banach space, then so is
$\mathcal{L}(X, Y)$.


Proof. We first show that a linear combination of bounded linear mappings is
bounded. Let a and b be scalars, and let $T_1, T_2 \in \mathcal{L}(X, Y)$. Then, for every $x \in X$,
$\|aT_1(x) + bT_2(x)\| \le |a|\|T_1(x)\| + |b|\|T_2(x)\| \le |a|\|T_1\|\|x\| + |b|\|T_2\|\|x\| =
(|a|\|T_1\| + |b|\|T_2\|)\|x\|$. This shows that $\|aT_1 + bT_2\| \le |a|\|T_1\| + |b|\|T_2\|$ and
that $\mathcal{L}(X, Y)$ is closed under addition and scalar multiplication. Verifying the rest
of the axioms for a vector space is routine.

The above inequality can also be used to verify the defining properties of
a norm. For example, taking $a = b = 1$ gives $\|T_1 + T_2\| \le \|T_1\| + \|T_2\|$. The
identity $\|aT\| = |a|\|T\|$ is obvious.

It remains to show that $\mathcal{L}(X, Y)$ is complete if Y is complete. Suppose $(T_n)$
is a Cauchy sequence in $\mathcal{L}(X, Y)$, and let $\epsilon > 0$. By assumption, there exists
a positive integer N such that, for $m, n \ge N$, $\|T_n - T_m\| < \epsilon$. For all $x \in X$,
$\|T_n(x) - T_m(x)\| = \|(T_n - T_m)(x)\| \le \|T_n - T_m\|\|x\| \le \epsilon\|x\|$. Thus $(T_n(x))$ is a
Cauchy sequence in Y, and hence $\lim_n T_n(x)$ exists for every $x \in X$. Define $T(x) =
\lim_n T_n(x)$. We show that $T \in \mathcal{L}(X, Y)$. The linearity of T is straightforward;
if $x, y \in X$, and a and b are scalars, then $T(ax + by) = \lim_n T_n(ax + by) =
\lim_n (aT_n(x) + bT_n(y)) = aT(x) + bT(y)$. To show that T is bounded, let $\epsilon = 1$.
There is a positive integer N such that, for $m, n \ge N$, $\|T_n - T_m\| < 1$. In
particular, for all $n \ge N$, $\|T_n(x) - T_N(x)\| \le \|x\|$. Hence, for all $x \in X$, and
all $n \ge N$, $\|T_n(x)\| \le \|T_n(x) - T_N(x)\| + \|T_N(x)\| \le \|x\| + \|T_N\|\|x\| =
(1 + \|T_N\|)\|x\|$. Taking the limit as $n \to \infty$, we obtain $\|T(x)\| \le (1 + \|T_N\|)\|x\|$. Thus
$\|T\| \le 1 + \|T_N\|$. Finally, we show that $\lim_n \|T_n - T\| = 0$. Let $\epsilon > 0$, and let N
be such that $\|T_n - T_m\| < \epsilon$ for all $m, n \ge N$. For all $x \in X$, $\|T_n(x) - T_m(x)\| \le
\epsilon\|x\|$. Taking the limit as $m \to \infty$, we have $\|T_n(x) - T(x)\| \le \epsilon\|x\|$ for all $x \in X$,
and all $n \ge N$. Thus $\|T_n - T\| \le \epsilon$ for all $n \ge N$; hence $\lim_n T_n = T$. ■


An important special case of theorem 6.2.3 is the space $\mathcal{L}(X, \mathbb{K})$ of all bounded
linear functionals from a normed linear space X to the base field. This space is
known as the dual space of X, and is denoted by $X^*$. Since $\mathbb{K}$ is complete, $X^*$ is a
Banach space, even when X is not complete.


Another important special case of theorem 6.2.3 is the space $\mathcal{L}(X) = \mathcal{L}(X, X)$
of bounded linear transformations on a Banach space X. Elements of $\mathcal{L}(X)$ are
also called bounded operators on X. The norm $\|T\| = \sup_{x \ne 0} \frac{\|T(x)\|}{\|x\|}$ is called, not
surprisingly, the operator norm on $\mathcal{L}(X)$.


Example 5. Let $\|.\|$ and $\|.\|'$ be norms on a vector space X. Then $\|.\|$ and $\|.\|'$ are
equivalent if and only if there exist positive constants $k_1$ and $k_2$ such that, for
every $x \in X$, $k_1\|x\| \le \|x\|' \le k_2\|x\|$. Note the contrast between this result and
exercise 12 on section 4.3.

If $k_1$ and $k_2$ exist, the two norms are equivalent by theorem 4.3.9. Conversely,
the equivalence of the two norms implies the bi-continuity of the identity
mapping $I : (X, \|.\|) \to (X, \|.\|')$. The continuity of I implies the existence of a
positive constant $k_2$ (namely, the norm of I relative to the given norms) such that
$\|x\|' = \|I(x)\|' \le k_2\|x\|$. The existence of $k_1$ is established in a similar way. ♦


Example 6. It is easy to verify that the function $\|f\| = \|f\|_\infty + |f(0)| + |f(1)|$
defines a norm on $C[0,1]$. This norm is equivalent to the uniform norm on
$C[0,1]$ by the previous example since $\|f\|_\infty \le \|f\| \le 3\|f\|_\infty$. ♦


Example 7. Let $L : \ell^2 \to \ell^2$ be the operator defined by $L(x_1, x_2, x_3, \ldots) = (x_2, x_3, \ldots)$,
and let $T_n = L^n$. Thus $T_n(x_1, x_2, \ldots) = (x_{n+1}, x_{n+2}, \ldots)$. Since $\|T_n(x)\|_2 \le \|x\|_2$ for
every $x \in \ell^2$, $\|T_n\| \le 1$. In fact, $\|T_n\| = 1$ because if we take $x = e_{n+1}$, then
$T_n(x) = e_1$, and $\|T_n(x)\|_2 = 1 = \|x\|_2$. Next we show that $\lim_n T_n(x) = 0$ for
every $x \in \ell^2$. Indeed, $\|T_n(x)\|_2^2 = \sum_{k=n+1}^{\infty} |x_k|^2$. Being the tail end of a convergent
series, the last quantity approaches 0 as $n \to \infty$. Observe that the sequence
$(T_n)$ does not converge in the operator norm because if it did, it should converge
to the zero operator. This is obviously false because $\|T_n\| = 1$. ♦
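
The two phenomena in example 7, pointwise convergence of the shift powers to zero together with operator norms that stay equal to 1, can be watched on finitely supported sequences. The sketch below is an illustration under assumptions: sequences are truncated to `LENGTH` entries, and the sample vector $x_k = 1/k$ is chosen here rather than taken from the text.

```python
import math

def shift_power(x, n):
    """Apply T_n = L^n to a finitely supported sequence: drop the first n entries."""
    return x[n:]

def norm2(x):
    """Euclidean norm of a finite list."""
    return math.sqrt(sum(t * t for t in x))

def basis_vector(index, length):
    """The canonical vector e_index (1-based), truncated to the given length."""
    return [1.0 if k == index else 0.0 for k in range(1, length + 1)]

LENGTH = 5000                                    # truncation length (an assumption)
x = [1.0 / k for k in range(1, LENGTH + 1)]

# ||T_n(x)|| is the norm of a tail of x, so it decreases toward 0 as n grows.
tail_norms = [norm2(shift_power(x, n)) for n in (0, 10, 100, 1000)]

# T_n maps e_{n+1} to e_1, so the image still has norm 1.
n = 100
image = shift_power(basis_vector(n + 1, LENGTH), n)

print(tail_norms, norm2(image))
```

The vector on which $T_n$ attains norm 1 moves further out as n grows, which is why no single estimate forces $\|T_n\|$ to be small.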


Definition. A bounded linear operator T on a Banach space X is said to be
bounded away from zero if there is a constant $c > 0$ such that $\|Tx\| \ge c\|x\|$ for
all $x \in X$.


Example 8. Let X be a Banach space, and let $T \in \mathcal{L}(X)$ be bounded away from
zero. Then T is one-to-one, $\mathcal{R}(T)$ is closed in X, and $T^{-1} : \mathcal{R}(T) \to X$ is
bounded.


If $Tx = 0$, then $\|x\| \le \|Tx\|/c = 0$; hence $x = 0$, and T is one-to-one.

To prove that $\mathcal{R}(T)$ is closed, we show that if $(x_n)$ is such that $\lim_n Tx_n = y$,
then $y \in \mathcal{R}(T)$. It is enough to show that $(x_n)$ is Cauchy because if we set $x =
\lim_n x_n$, then $y = \lim_n Tx_n = Tx$. Since $(Tx_n)$ is a Cauchy sequence, $\|x_n - x_m\| \le
(1/c)\|Tx_n - Tx_m\| \to 0$ as $n, m \to \infty$. Thus $(x_n)$ is a Cauchy sequence, as desired.
Finally, the inequality $\|Tx\| \ge c\|x\|$ implies that $\|T^{-1}\| \le 1/c$. ♦


We conclude this section with an example of a representation theorem. The result
below gives a concrete characterization of the dual of the sequence spaces $\ell^p$.


Theorem 6.2.4. Let $1 < p < \infty$. The dual $(\ell^p)^*$ of $\ell^p$ is isometrically isomorphic to $\ell^q$,
where $\frac{1}{p} + \frac{1}{q} = 1$.


Proof. Fix a real number $p > 1$. For $y \in \ell^q$, define a functional $\lambda_y : \ell^p \to \mathbb{K}$ by
$\lambda_y(x) = \sum_{n=1}^{\infty} x_ny_n$. By Hölder’s inequality, $|\lambda_y(x)| \le \sum_{n=1}^{\infty} |x_ny_n| \le \|x\|_p\|y\|_q$.
Thus $\lambda_y$ is a bounded linear functional on $\ell^p$, and $\|\lambda_y\| \le \|y\|_q$. The linearity of
$\lambda_y$ is clear. To show that $\|\lambda_y\| = \|y\|_q$, define a sequence $x = (x_n)$ as follows: $x_n = 0$
if $y_n = 0$, and $x_n = |y_n|^q/y_n$ if $y_n \ne 0$. Since $|x_n|^p = |y_n|^{(q-1)p} = |y_n|^q$, we have
$\|x\|_p = \|y\|_q^{q/p} = \|y\|_q^{q-1}$; hence $x \in \ell^p$. Now

$$\lambda_y(x) = \sum_{n=1}^{\infty} x_ny_n = \sum_{n=1}^{\infty} |y_n|^q = \|y\|_q^q = \|y\|_q^{q-1}\|y\|_q = \|x\|_p\|y\|_q.$$

Thus $\|\lambda_y\| = \|y\|_q$.
The mapping $\Lambda : \ell^q \to (\ell^p)^*$ given by $y \mapsto \lambda_y$ is clearly linear, and the fact that
$\|\lambda_y\| = \|y\|_q$ makes $\Lambda$ an isometry. It remains to show that $\Lambda$ maps $\ell^q$ onto $(\ell^p)^*$.

Let $\lambda$ be a bounded linear functional on $\ell^p$. We need to show that $\lambda = \lambda_y$ for
some $y \in \ell^q$. Let $e_n$ be the canonical vectors in $\mathbb{K}(\mathbb{N})$. For $x = (x_1, x_2, \ldots) \in \ell^p$, the
sequence $\xi_n = \sum_{i=1}^{n} x_ie_i = (x_1, x_2, \ldots, x_n, 0, 0, 0, \ldots)$ converges to x in $\ell^p$. Let $y_n =
\lambda(e_n)$, and let $y = (y_n)$. First we show that $y \in \ell^q$. Let $\eta_n = (y_1, y_2, \ldots, y_n, 0, 0, 0, \ldots)$,
and define $\lambda_n(x) = \sum_{i=1}^{n} x_iy_i$. By the part of the theorem we already established,
$\lambda_n \in (\ell^p)^*$, and $\|\lambda_n\| = \|\eta_n\|_q = (\sum_{i=1}^{n} |y_i|^q)^{1/q}$. Now $|\lambda_n(x)| = |\lambda(\xi_n)| \le
\|\lambda\|\|\xi_n\|_p \le \|\lambda\|\|x\|_p$; hence $\|\lambda_n\| \le \|\lambda\|$. Therefore $(\sum_{i=1}^{n} |y_i|^q)^{1/q}$ is bounded by
$\|\lambda\|$; hence $\sum_{n=1}^{\infty} |y_n|^q < \infty$, that is, $y \in \ell^q$. Finally, we show that $\lambda = \lambda_y$:

$$\lambda(x) = \lambda(\lim_n \xi_n) = \lim_n \lambda(\xi_n) = \lim_n \lambda\Big(\sum_{i=1}^{n} x_ie_i\Big) = \lim_n \sum_{i=1}^{n} x_i\lambda(e_i) = \lim_n \sum_{i=1}^{n} x_iy_i = \sum_{n=1}^{\infty} x_ny_n = \lambda_y(x). \ ■$$


We sometimes summarize the above result by saying that the dual of $\ell^p$ is $\ell^q$ instead
of saying that the dual of $\ell^p$ is isometrically isomorphic to $\ell^q$. This slight abuse of
language is common.
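
The norming vector used in the proof of theorem 6.2.4, $x_n = |y_n|^q/y_n$ when $y_n \ne 0$ and $x_n = 0$ otherwise, can be checked on a short real vector. The sketch below is an illustration under assumptions: the exponent `p = 3` and the sample vector `y` are choices made here and do not come from the text, and only real entries are treated.

```python
def p_norm(x, p):
    """The p-norm of a finite list of real numbers."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def extremal_vector(y, q):
    """x_n = |y_n|^q / y_n where y_n != 0, and x_n = 0 where y_n = 0."""
    return [0.0 if t == 0 else abs(t) ** q / t for t in y]

p = 3.0
q = p / (p - 1.0)                      # conjugate exponent: 1/p + 1/q = 1
y = [2.0, -1.0, 0.0, 0.5, -3.0]        # illustrative sample vector
x = extremal_vector(y, q)

pairing = sum(a * b for a, b in zip(x, y))     # value of the functional at x
product = p_norm(x, p) * p_norm(y, q)          # upper bound from Hoelder's inequality

print(pairing, product)
```

The two printed numbers agree up to rounding, which is the equality case of Hölder’s inequality used to show that the functional determined by y has norm exactly $\|y\|_q$.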


Exercises 


1. Prove that if a linear function $T : X \to Y$ maps bounded sets into bounded
sets, then T is bounded.

2. Let $T : X \to Y$ be a bounded linear mapping. Show that the closed ball
$\{y \in Y : \|y\| \le \|T\|\}$ is the smallest ball in Y that contains the image of the
closed unit ball $\{x \in X : \|x\| \le 1\}$ in X.

3. Let $T : X \to Y$ be a bounded linear mapping. Show that $\|T\| =
\sup_{\|x\| < 1}\|T(x)\|$.

4. Show that a bounded linear mapping $T : X \to Y$ is uniformly continuous.

5. Prove that every linear functional on X is bounded if and only if
$\dim(X) < \infty$.

6. Let $\lambda \in X^*$. Prove that $\lambda$ is an open mapping.

7. Let $T : X \to X$ be a linear mapping such that whenever $x_n \to 0$, then $\{T(x_n)\}$
is bounded. Prove that T is bounded. Hint: If T is unbounded, then there
exists a sequence $(x_n)$ such that $x_n \to 0$ but $\|T(x_n)\|$ is bounded away from
0. Consider the sequence $\xi_n = x_n/\sqrt{\|x_n\|}$.

8. Suppose X is an n-dimensional normed linear space. Prove that
$\dim(X^*) = n$.

9. Let $T : X \to Y$ be a bounded linear injection. Prove that the following
conditions are equivalent:

(a) T is an isometry from X onto Y.

(b) $T(S_X) = S_Y$.

(c) $T(B_X) = B_Y$.

Here $B_X$ and $B_Y$ are the closed unit balls in X and Y, respectively, and $S_X$
and $S_Y$ are the unit spheres in X and Y, respectively.

10. We know that if $1 \le p < q \le \infty$, then $\ell^p \subseteq \ell^q$; see problem 3 on section 3.6.
Let $i : \ell^p \to \ell^q$ be the inclusion map. Find $\|i\|$.

11. Define a linear operator $T \in \mathcal{L}(c_0)$ as follows: for $x = (x_n)$, T(x) = (). Find
$\|T\|$ and show that $\mathcal{R}(T)$ is dense in $c_0$.

12. In connection with example 1, show that $\|T\| = \|c\|_\infty$.

13. Let X be the space of polynomials equipped with the norm $\|f\| = \sup_{0 \le x \le 1}
|f(x)|$. Prove that differentiation is an unbounded operator on X.

14. Define a function $\|.\|'$ on the space of null sequences $c_0$ by $\|x\|' =
\sum_{n=1}^{\infty} 2^{-n}|x_n|$. Here $x = (x_n)$. Prove that the given function is a norm and
that it is not equivalent to the infinity norm on $c_0$. Hint: The sequence
$(1, 1, \ldots, 1, 0, 0, 0, \ldots)$ is Cauchy in $\|.\|'$.

15. Let $\|.\|_1$ and $\|.\|_2$ be equivalent norms on a Banach space X. Prove that the
closed unit balls $B_1 = \{x \in X : \|x\|_1 \le 1\}$ and $B_2 = \{x \in X : \|x\|_2 \le 1\}$ are
homeomorphic. Hint: Consider the function $g : B_1 \to B_2$ defined by

16. Prove that $c_0^* = \ell^1$ and $(\ell^1)^* = \ell^\infty$.

17. Let $\lambda : X \to \mathbb{K}$ be a nonzero linear functional on a vector space X. Prove that
there exists a one-dimensional subspace M of X such that $X = \mathrm{Ker}(\lambda) \oplus M$.

18. Let $\lambda : X \to \mathbb{K}$ be a nonzero linear functional on a normed linear space X.
Prove that the following are equivalent:

(a) $\lambda$ is bounded.

(b) $\mathrm{Ker}(\lambda)$ is closed.

(c) $\mathrm{Ker}(\lambda)$ is not dense in X.

Conclude that $\lambda$ is unbounded if and only if $\mathrm{Ker}(\lambda)$ is dense in X.

19. Let $\lambda : X \to \mathbb{K}$ be a nonzero linear functional on a normed linear space X.
Prove that the following are equivalent:

(a) $\lambda$ is unbounded.

(b) There is a sequence $(x_n)$ in X such that $\|x_n\| = 1$ and $\lim_n |\lambda(x_n)| = \infty$.

(c) There is a sequence $(x_n)$ in X such that $\lim_n x_n = 0$ and $\lambda(x_n) = 1$.

20. Let $T : C[0,1] \to C[0,1]$ be the linear operator $(Tf)(x) = \int_0^x f(t)\,dt$. Show
that T is bounded, and find its norm.

21. Let $\lambda : C[0,1] \to \mathbb{K}$ be the linear functional $\lambda(f) = \int_0^1 f(t)\,dt$. Show that $\lambda$
is bounded, and find its norm.

22. Define a linear operator $P_n$ on the space of convergent sequences c by
$P_n(x) = (x_1, x_2, \ldots, x_n, x_n, x_n, \ldots)$.

(a) Prove that $\|P_n\| = 1$ and that $\lim_n P_n(x) = x$ for all $x \in c$.

(b) Prove that $u_1 = (1, 1, 1, \ldots)$, $u_2 = (0, 1, 1, 1, \ldots)$, $u_3 = (0, 0, 1, 1, 1, \ldots)$, ..., is
a Schauder basis for c.

23. Let $\{t_n\}$ be a countable dense subset of [0, 1], where $t_1 = 0$, $t_2 = 1$. For $n \in \mathbb{N}$,
define an operator $P_n$ on $C[0,1]$ as follows: $P_nf$ is the continuous, piecewise
linear function with nodes $t_1, \ldots, t_n$ such that $(P_nf)(t_i) = f(t_i)$ for $1 \le i \le n$.
Show that $\|P_n\| = 1$ for all $n \in \mathbb{N}$ and that, for every $f \in C[0,1]$,
$\lim_n \|P_nf - f\|_\infty = 0$.

24. This is a continuation of the previous exercise. Define $u_1(x) = 1$, and, for
$n \ge 2$, define $u_n$ to be the continuous, piecewise linear function such that
$u_n(t_n) = 1$, and $u_n(t_i) = 0$ for $1 \le i \le n - 1$. Prove that $\{u_i\}_{i=1}^{n}$ is a basis for
the range of $P_n$, and hence conclude that $\{u_n\}_{n=1}^{\infty}$ is a Schauder basis for
$C[0,1]$.


Definition. Let $\{u_n\}$ be a Schauder basis for a Banach space X. Thus every
$x \in X$ has a unique representation $x = \sum_{n=1}^{\infty} a_n(x)u_n$. Define the canonical
projections $P_n : X \to \mathrm{Span}\{u_1, \ldots, u_n\}$ by $P_n(x) = \sum_{i=1}^{n} a_i(x)u_i$. Notice that
the last three problems include examples of canonical projections. We
assume, without proof, the fact that the set $\{P_n\}$ is uniformly bounded, that
is, $\sup_n\|P_n\| < \infty$.


25. Let $\{u_n\}$ be a Schauder basis for a Banach space X, and consider the series
representation $x = \sum_{n=1}^{\infty} a_nu_n$ of an element $x \in X$. Each of the coefficients
$a_n$ is clearly a linear functional on X. Prove that $a_n \in X^*$. Hint: $a_n(x)u_n =
P_n(x) - P_{n-1}(x)$.


6.3 Three Fundamental Theorems 


In addition to the Hahn-Banach theorem, the three theorems we present in this 
section are of fundamental importance. All three theorems require completeness; 
hence they apply only to Banach spaces.

In chapter 4 (see problem 5 on section 4.8), we encountered an example
where a family of pointwise bounded functions on a complete metric space is,
in fact, uniformly bounded on a ball. Lemma 6.3.1 is similar in spirit, and its
proof demonstrates the centrality of Baire’s theorem in this section. Because the
boundedness of a linear function on a ball implies its boundedness, it should not be
surprising that when X is a Banach space, pointwise boundedness implies uniform
boundedness. This is the uniform boundedness principle.

The open mapping theorem is a central theorem in functional analysis, and 
one cannot exaggerate its importance. Lemma 6.3.3 is critical to the proof of the 
open mapping theorem, and, again, completeness is crucial. The closed graph 
theorem comes in quite handy in certain applications to prove the boundedness 
of a linear function. It follows rather easily from the open mapping theorem. Later 
in the book, you will see many applications of the three theorems, as well as the 
Hahn-Banach theorem. 


In this section, X and Y are normed linear spaces. 


A family of bounded linear functions $\{T_\alpha\}_{\alpha \in I}$ from X to Y such that, for each $x \in X$,
$\sup_{\alpha \in I}\|T_\alpha(x)\| < \infty$ is said to be pointwise bounded. If $\sup_{\alpha \in I}\|T_\alpha\| < \infty$, we say
that the family $\{T_\alpha\}$ is uniformly bounded.


Example 1. Let X and Y be normed linear spaces, and suppose that $\dim(X) <
\infty$. If a family of linear transformations $\{T_\alpha\}_{\alpha \in I}$ from X to Y is pointwise
bounded, then $\sup_{\alpha \in I}\|T_\alpha\| < \infty$. To see this, fix a basis $\{x_1, \ldots, x_n\}$ for X,
and use the 1-norm on X. Thus if $x = \sum_{i=1}^{n} a_ix_i$, then $\|x\| = \sum_{i=1}^{n} |a_i|$. Define
$M_i = \sup_{\alpha \in I}\|T_\alpha(x_i)\|$, and let $M = \max_{1 \le i \le n}M_i$. For any $\alpha \in I$, we have

$$\|T_\alpha(x)\| = \Big\|T_\alpha\Big(\sum_{i=1}^{n} a_ix_i\Big)\Big\| \le \sum_{i=1}^{n} |a_i|\|T_\alpha(x_i)\| \le \sum_{i=1}^{n} M_i|a_i| \le M\|x\|. \ ♦$$


The uniform boundedness principle generalizes example 1 to the infinite- 
dimensional case. 


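Lemma 6.3.1. Let X be a Banach space, and let Y be a normed linear space.
Suppose $\{T_\alpha\}_{\alpha \in I}$ is a family of bounded linear functions $X \to Y$ such that, for
each $x \in X$, $\sup_{\alpha \in I}\|T_\alpha(x)\| < \infty$. Then there exists a ball $B(x_0, \delta)$ such that
$\sup\{\|T_\alpha(x)\| : x \in B(x_0, \delta), \alpha \in I\} < \infty$.


Proof. For each $n \in \mathbb{N}$, let $F_n = \bigcap_{\alpha \in I}\{x \in X : \|T_\alpha(x)\| \le n\}$. Note that each $F_n$ is
closed and that $X = \bigcup_{n=1}^{\infty} F_n$. Since X is complete, Baire’s theorem forces at least
one set $F_N$ to have a nonempty interior. Thus there exists a ball $B = B(x_0, \delta) \subseteq
\mathrm{int}(F_N) \subseteq F_N$. Now, for every $x \in B$ and every $\alpha \in I$, $\|T_\alpha(x)\| \le N$. ■


Theorem 6.3.2 (the uniform boundedness principle). Under the assumptions
of lemma 6.3.1, the family $\{T_\alpha\}$ is a bounded subset of $\mathcal{L}(X, Y)$, that is,
$\sup_\alpha\|T_\alpha\| < \infty$.


Proof. We continue to use the notation of lemma 6.3.1.
Let $\beta = \sup_\alpha\|T_\alpha(x_0)\|$. We claim that $\sup\{\|T_\alpha(x)\| : \|x\| \le 1, \alpha \in I\} \le
\frac{2}{\delta}(N + \beta)$. This will show that $\sup\{\|T_\alpha\| : \alpha \in I\} \le \frac{2}{\delta}(N + \beta)$. Let $\alpha \in I$, and
$\|x\| \le 1$. Then $x_0 + \frac{\delta x}{2} \in B$, and

$$\|T_\alpha(x)\| = \frac{2}{\delta}\Big\|T_\alpha\Big(\frac{\delta x}{2}\Big)\Big\| = \frac{2}{\delta}\Big\|T_\alpha\Big(x_0 + \frac{\delta x}{2}\Big) - T_\alpha(x_0)\Big\| \le \frac{2}{\delta}\Big[\Big\|T_\alpha\Big(x_0 + \frac{\delta x}{2}\Big)\Big\| + \|T_\alpha(x_0)\|\Big] \le \frac{2}{\delta}(N + \beta). \ ■$$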


The above theorem fails when X is not complete. See problem 1 at the end of this 
section. 
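
A standard way to see how completeness enters is to look at the space of finitely supported sequences with, say, the supremum norm, and the functionals $\lambda_n(x) = nx_n$. This is a well-known counterexample offered here as an assumption; it may or may not be the one intended in the problem cited above. Each vector has only finitely many nonzero entries, so the family is pointwise bounded, yet $|\lambda_n(e_n)| = n$ with $\|e_n\| = 1$. The sketch below merely tabulates these two facts; the sample vector and helper name are illustrative.

```python
def functional(n, x):
    """lambda_n(x) = n * x_n for a finitely supported sequence stored as a list."""
    return n * x[n - 1] if n <= len(x) else 0.0

# Pointwise boundedness: for a fixed finitely supported x, only finitely many
# values lambda_n(x) are nonzero, so their supremum is finite.
x = [1.0, -0.5, 0.25, 2.0]
pointwise_bound = max(abs(functional(n, x)) for n in range(1, 1000))

# No uniform bound: lambda_n applied to the unit vector e_n returns n.
norms = [abs(functional(n, [0.0] * (n - 1) + [1.0])) for n in (1, 10, 100)]

print(pointwise_bound, norms)
```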


In the lemma below, we use the notation $B_\delta$ to denote an open ball in X of radius
$\delta$ centered at 0. We use the same notation, in addition to the prime character,
to indicate an open ball in Y. Thus $B'_r$ denotes an open ball in Y of radius r and
centered at 0.


Lemma 6.3.3. Suppose that X and Y are Banach spaces and that T is a bounded
linear mapping from X to Y. If, for some $r > 0$, $B'_r \subseteq \overline{T(B_1)}$, then $B'_r \subseteq T(B_3)$.
Equivalently, $B'_{r/3} \subseteq T(B_1)$.


Proof. First observe that $B'_r \subseteq \overline{T(B_1)}$ implies that $B'_{r/2^i} \subseteq \overline{T(B_{1/2^i})}$, for every $i \in \mathbb{N}$.
Pick $y \in B'_r$. There exists $x_1 \in B_1$ such that $\|y - T(x_1)\| < r/2$. Now $y - T(x_1) \in
B'_{r/2} \subseteq \overline{T(B_{1/2})}$, so there exists $x_2 \in B_{1/2}$ such that $\|y - T(x_1) - T(x_2)\| < r/4$.
Continuing in this manner, we can construct a sequence $(x_n)$ in X such that
$x_n \in B_{1/2^{n-1}}$ (i.e., $\|x_n\| < 1/2^{n-1}$), and $\|y - T(x_1) - T(x_2) - \ldots - T(x_n)\| < r/2^n$.
Because $\|x_n\| < 1/2^{n-1}$, the sequence $S_n = x_1 + \ldots + x_n$ is a Cauchy sequence in
X; hence $x = \lim_n S_n$ exists. Now $T(x) = T(\lim_n S_n) = \lim_n T(S_n) = y$, and $\|x\| =
\lim_n \|S_n\| = \lim_{n \to \infty} \|x_1 + \ldots + x_n\| \le \lim_n \sum_{i=1}^{n} \|x_i\| \le \sum_{i=1}^{\infty} 1/2^{i-1} = 2 < 3$. We
have shown that every $y \in B'_r$ is the image of an element $x \in B_3$. This proves the
result. ■


Theorem 6.3.4 (the open mapping theorem). Suppose that X and Y are Banach
spaces and that $T : X \to Y$ is a bounded linear mapping from X onto Y. Then there
exists a number $\delta > 0$ such that $B'_\delta \subseteq T(B_1)$.


Proof. Since T is onto, $Y = \bigcup_{n=1}^{\infty} \overline{T(B_n)}$. Baire’s theorem implies that $\overline{T(B_N)}$
has a nonempty interior for some positive integer N. Thus there exists an element
$y_0 \in Y$ and a positive number r such that $B(y_0, r) \subseteq \overline{T(B_N)}$. We claim that
$B'_r \subseteq \overline{T(B_{2N})}$. Let $y \in Y$ be such that $\|y\| < r$, and let $\epsilon > 0$. Both $y_0$ and $y_0 + y$
are in $B(y_0, r)$, so there are vectors $\xi$ and $\eta$ in $B_N$ such that $\|y + y_0 - T(\xi)\| <
\epsilon/2$, and $\|y_0 - T(\eta)\| < \epsilon/2$. Let $x = \xi - \eta$. Then $\|x\| \le \|\xi\| + \|\eta\| < 2N$,
and $\|y - T(x)\| = \|y - T(\xi - \eta) - y_0 + y_0\| \le \|y + y_0 - T(\xi)\| + \|T(\eta) - y_0\| <
\epsilon/2 + \epsilon/2 = \epsilon$. This proves that $B'_r \subseteq \overline{T(B_{2N})}$, which establishes our claim and
implies that $B'_{r/(2N)} \subseteq \overline{T(B_1)}$. By lemma 6.3.3, $B'_\delta \subseteq T(B_1)$, where $\delta = \frac{r}{6N}$. ■


Corollary 6.3.5. Under the assumptions of the open mapping theorem, given $r > 0$,
there exists $\delta > 0$ such that $B'_\delta \subseteq T(B_r)$.


The following theorem justifies the name of the open mapping theorem. 


Theorem 6.3.6. Under the assumptions of the open mapping theorem, T is an open 
mapping. 


Proof. Let U be an open subset of X. We need to show that T(U) is open in Y.
Let $y = T(x) \in T(U)$. Since U is open, there exists $r > 0$ such that $B(x, r) \subseteq U$.
Corollary 6.3.5 implies that there is a positive number $\delta$ such that $B'_\delta \subseteq T(B_r)$.
Now $T(U) \supseteq T(B(x, r)) = T(x + B_r) = T(x) + T(B_r) \supseteq y + B'_\delta = B(y, \delta)$. This concludes
the proof.


The continuity of a function does not imply its openness. For example, the function
$f(x) = \sin x$ is continuous but not open, since the image of the open interval $(0, \pi)$
is $(0, 1]$, which is not open in $\mathbb{R}$.


Example 2. Under the assumptions of the open mapping theorem, there exists a
constant $M > 0$ such that, for every $y \in Y$, there is an element $x \in T^{-1}(y)$ such
that $\|x\| \le M\|y\|$. By the open mapping theorem, there exists a positive number
$\delta$ such that $B'_\delta \subseteq T(B_1)$. For a nonzero vector $y \in Y$, $\frac{\delta y}{2\|y\|} \in B'_\delta$; hence there is a
vector $x_1 \in X$ such that $\|x_1\| < 1$ and $T(x_1) = \frac{\delta y}{2\|y\|}$. Define $x = \frac{2\|y\|}{\delta} x_1$. One can
see that $T(x) = y$, and $\|x\| < \frac{2\|y\|}{\delta}$. The constant we seek is $M = \frac{2}{\delta}$. ♦


The following results represent a small sample of applications of the open mapping 
theorem. 


Theorem 6.3.7. Let X and Y be Banach spaces, and let $T : X \to Y$ be a bounded
bijection. Then T is a homeomorphism.


Proof. By theorem 6.3.6, T is an open mapping; hence $T^{-1}$ is continuous, and T is a
homeomorphism. ∎


The above theorem also follows from example 2. If T is injective, then
$\|T^{-1}\| \le M$.

The following theorem states that it is not possible for a complete norm to be 
strictly stronger than another. 


Theorem 6.3.8. Let X be a Banach space under each of the norms $\|\cdot\|$ and $\|\cdot\|'$. If
there exists a constant $\alpha > 0$ such that $\|x\| \le \alpha\|x\|'$ for every $x \in X$, then there
exists a constant $\beta > 0$ such that $\|x\|' \le \beta\|x\|$ for every $x \in X$.


Proof. Consider the identity mapping $I_X : (X, \|\cdot\|') \to (X, \|\cdot\|)$. The assumption
$\|x\| \le \alpha\|x\|'$ is equivalent to the boundedness of $I_X$. By theorem 6.3.7, the inverse
of $I_X$ is also continuous. Thus $I_X : (X, \|\cdot\|) \to (X, \|\cdot\|')$ is bounded. Thus there
exists a positive constant $\beta$ such that, for all $x \in X$, $\|x\|' \le \beta\|x\|$.


Definition. Let $(X, d)$ and $(Y, \rho)$ be metric spaces, and let $T : X \to Y$. The graph
of T is the subset $G = \{(x, T(x)) : x \in X\}$ of $X \times Y$. We say that the graph of T is
closed if G is closed in the product metric on $X \times Y$.

Recall that a sequence $(x_n, y_n) \in X \times Y$ converges to $(x, y)$ if and only if $x_n \to x$
and $y_n \to y$. Thus the graph of T is closed if, whenever $x_n \to x$ and $T(x_n) \to y$,
then $(x, y) \in G$, or simply $y = T(x)$. It is a simple exercise to verify that if T is
continuous, then the graph of T is closed in $X \times Y$. For Banach spaces and linear
mappings, the converse is true.


Theorem 6.3.9 (the closed graph theorem). Let X and Y be Banach spaces, and let 
T be a linear mapping from X to Y. If the graph of T is closed, then T is bounded. 


Proof. Define a norm on X as follows: $\|x\|' = \|x\| + \|T(x)\|$. We first show that $\|\cdot\|'$
is complete, and hence $(X, \|\cdot\|')$ is a Banach space. If $(x_n)$ is a Cauchy sequence
in $\|\cdot\|'$, then, for $\epsilon > 0$, there is a natural number N such that, for $m, n \ge N$,
$\|x_n - x_m\|' < \epsilon$. In particular, both $(x_n)$ and $(T(x_n))$ are Cauchy sequences in X
and Y, respectively. The completeness of X and Y guarantees that both sequences
converge, say, $x = \lim_n x_n$ and $y = \lim_n T(x_n)$. The assumption that the
graph of T is closed implies that $y = T(x)$. Now $\|x_n - x\|' = \|x_n - x\| + \|T(x_n) -
T(x)\| = \|x_n - x\| + \|T(x_n) - y\| \to 0$ as $n \to \infty$. This demonstrates the completeness
of $\|\cdot\|'$. Now $\|x\| \le \|x\| + \|T(x)\| = \|x\|'$. By theorem 6.3.8, the two norms $\|\cdot\|$
and $\|\cdot\|'$ are equivalent; thus the boundedness of T in one norm is equivalent to its
boundedness in the other. But the boundedness of T in the $\|\cdot\|'$ norm is immediate
from the inequality $\|T(x)\| \le \|T(x)\| + \|x\| = \|x\|'$. ∎


The following examples show that both the linearity of T and the completeness
of the spaces are needed for the closed graph theorem to hold.


Example 3. The function $f : \mathbb{R} \to \mathbb{R}$, defined below, is discontinuous but its graph
is closed:

$f(x) = 1/x$ if $x \ne 0$, and $f(x) = 0$ if $x = 0$. ♦


Example 4. Let $X = C[0, 1]$ be equipped with the 1-norm, and let $Y = C[0, 1]$ be
equipped with the uniform norm. The identity function $I : X \to Y$ is discontinuous
by example 7 in section 3.6. However, the graph of I is closed. Suppose
$\lim_n \|f_n - f\|_1 = 0$ and $\lim_n \|I(f_n) - g\|_\infty = \lim_n \|f_n - g\|_\infty = 0$. Since
convergence in the uniform norm implies convergence in the 1-norm,
$\lim_n \|f_n - g\|_1 = 0$. Now the uniqueness of limits forces $f = g$.
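The gap between the two norms in example 4 can be seen numerically. The sketch below is only an illustration under stated assumptions: the monomials $f_n(x) = x^n$ are a hypothetical choice of test functions (not necessarily the ones used in section 3.6), and the helper names `one_norm` and `sup_norm` are invented here. The 1-norm is approximated by a midpoint rule and the uniform norm by a maximum over a grid.

```python
# Sketch: the monomials f_n(x) = x**n (a hypothetical choice, not taken from
# the text) shrink in the 1-norm but keep uniform norm 1 on [0, 1].

def one_norm(f, steps=10_000):
    """Midpoint-rule approximation of the integral of |f| over [0, 1]."""
    h = 1.0 / steps
    return h * sum(abs(f((k + 0.5) * h)) for k in range(steps))

def sup_norm(f, steps=10_000):
    """Maximum of |f| over an evenly spaced grid that includes 0 and 1."""
    return max(abs(f(k / steps)) for k in range(steps + 1))

def monomial(n):
    return lambda x: x ** n

# Map n to the pair (approximate 1-norm, grid-based uniform norm) of f_n.
norms = {n: (one_norm(monomial(n)), sup_norm(monomial(n))) for n in (1, 10, 100)}

for n, (l1, sup) in norms.items():
    print(n, l1, sup)
```

The 1-norms behave like $1/(n+1)$ while the uniform norms stay at 1, so no inequality of the form $\|f\|_\infty \le C\|f\|_1$ can hold on $C[0, 1]$.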




Exercises 


1. Let $\lambda_n : \mathbb{K}(\mathbb{N}) \to \mathbb{K}$ be the functional defined by $\lambda_n(x) = n x_n$. Prove that the
set $\{\lambda_n\}$ is pointwise bounded but not uniformly bounded. Here $x = (x_n)$, and
$\mathbb{K}(\mathbb{N})$ is given the supremum norm.


2. The Banach-Steinhaus theorem. Let X and Y be Banach spaces, and let
$(T_n)$ be a sequence of bounded linear mappings from X to Y such that,
for every $x \in X$, $T(x) = \lim_n T_n(x)$ exists. Prove that T is bounded and that
$\|T\| \le \liminf_n \|T_n\|$. Is it necessarily true that $\lim_n T_n = T$ in $\mathcal{L}(X, Y)$?


3. Let $(x_n)$ be a sequence such that $\sum_{n=1}^\infty x_n y_n < \infty$ for all sequences $(y_n) \in \ell^q$.
Prove that $(x_n) \in \ell^p$. Here p and q are conjugate Hölder exponents with $p > 1$.


4. Let $(x_n)$ be a sequence such that $\sum_{n=1}^\infty x_n y_n < \infty$ for all sequences $(y_n)$ that
converge to 0. Prove that $\sum_{n=1}^\infty |x_n| < \infty$.


5. Let X be a Banach space, and suppose that the sequence $\lambda_n \in X^*$ is pointwise
bounded. Prove that $\{\lambda_n\}$ is equicontinuous.


6. Let M and N be closed subspaces of a Banach space $(X, \|\cdot\|)$ such that
$X = M \oplus N$. Thus every $x \in X$ can be written uniquely as $x = y + z$, where
$y \in M$, $z \in N$. Define a norm on X by $\|x\|' = \|y\| + \|z\|$. Prove that $\|\cdot\|'$ is
equivalent to $\|\cdot\|$.
6.4 The Hahn-Banach Theorem


The importance of the Hahn-Banach theorem cannot be overstated. The results 
following theorem 6.4.4 represent only a sample of the wide range of applications 
of the Hahn-Banach theorem. Unlike the three major theorems of the previous 
section, the Hahn-Banach theorem does not require completeness. 


The Hahn-Banach theorem has many guises, and one of them is an extension 
theorem. The following example shows that, from the purely algebraic perspective, 
extending a linear functional on a subspace M of a vector space X is a trivial task. 
Compare the following example to theorem 6.4.4. 


Example 1. Let M be a subspace of a vector space X, and let $\lambda$ be a linear functional
on M. Then $\lambda$ can be extended to a linear functional on X.


Let $S_1$ be a basis for M, and choose a subset $S_2$ of X such that $S_1 \cup S_2$ is a basis
for X. Define a function $\Lambda : S_1 \cup S_2 \to \mathbb{C}$ as follows:


$\Lambda(x) = \lambda(x)$ if $x \in S_1$, and $\Lambda(x) = 0$ if $x \in S_2$.


Extend the function $\Lambda$ by linearity to a functional $\Lambda$ on X. The restriction of $\Lambda$
to M is clearly $\lambda$. ♦


One of the corollaries of the Hahn-Banach theorem (theorem 6.4.5) is a powerful 
separation theorem. Earlier in the book, we saw examples of separation theorems 
by linear functionals, albeit in a slightly different context. See example 10 in section
4.7. The following example shows, once again, that, from the algebraic point of 
view, the problem of separating a subspace from a point outside it is a simple one. 
Compare the result below to theorem 6.4.5. 


Example 2. Let M be a proper subspace of a vector space X, and let $x_0 \in X - M$.
There exists a linear functional $\lambda$ on X such that $\lambda(M) = 0$ and $\lambda(x_0) \ne 0$.


Choose a basis $S_1$ for M. Since $S_1 \cup \{x_0\}$ is independent, there is a subset
$S_2$ of X such that $S_1 \cup \{x_0\} \cup S_2$ is a basis for X. Define a function
$\lambda : S_1 \cup \{x_0\} \cup S_2 \to \mathbb{C}$ by


$\lambda(x) = 1$ if $x = x_0$, and $\lambda(x) = 0$ if $x \in S_1 \cup S_2$.


Extend $\lambda$ by linearity to a linear functional $\lambda$ on X. Clearly, $\lambda(M) = 0$. ♦
Definition. Let X be a complex normed linear space. A real functional u on X is
said to be a bounded real-valued functional if


(a) $u(x + y) = u(x) + u(y)$ for all $x, y \in X$,
(b) $u(ax) = au(x)$ for all $x \in X$ and $a \in \mathbb{R}$, and
(c) $\|u\| = \sup_{\|x\|=1} |u(x)| < \infty$.


Lemma 6.4.1. Let $\lambda$ be a bounded complex functional on a normed linear space X,
and let u be the real part of $\lambda$. Then


(a) $\lambda(x) = u(x) - iu(ix)$, and

(b) u is a bounded real functional on X and $\|u\| = \|\lambda\|$.
Conversely, if u is a bounded real functional on X, then $\lambda(x) = u(x) - iu(ix)$
is a bounded complex functional on X.


Proof. Write $\lambda(x) = u(x) + iv(x)$. On the one hand, $\lambda(ix) = i\lambda(x) = iu(x) - v(x)$.
On the other hand, $\lambda(ix) = u(ix) + iv(ix)$. Equating the real parts of the right-hand sides
of the above identities yields $v(x) = -u(ix)$, and hence (a). Since, for any
complex number z, $|\mathrm{Re}(z)| \le |z|$, we have $|u(x)| \le |\lambda(x)|$; hence $\|u\| \le \|\lambda\|$. If $\lambda(x) \ne 0$,
let $\alpha = \frac{|\lambda(x)|}{\lambda(x)}$. Then $|\lambda(x)| = \alpha\lambda(x) = \lambda(\alpha x) = u(\alpha x) \le \|u\|\|\alpha x\| =
\|u\|\|x\|$. Thus $\|\lambda\| \le \|u\|$, and this establishes (b).

Conversely, if u is a bounded real functional on X and $\lambda(x) = u(x) - iu(ix)$,
then the additivity of $\lambda$ is straightforward. Now $\lambda(ix) = u(ix) - iu(-x) =
u(ix) + iu(x) = i[u(x) - iu(ix)] = i\lambda(x)$. Hence, for real a and b, $\lambda((a + ib)x) = \lambda(ax) + \lambda(ibx) =
a\lambda(x) + i\lambda(bx) = a\lambda(x) + ib\lambda(x) = (a + ib)\lambda(x)$. Thus $\lambda$ is complex linear. The
boundedness of $\lambda$ follows from the proof of part (b). ∎


Lemma 6.4.2. Let M be a subspace of a real normed linear space X, and let
$x_0 \in X - M$. If u is a bounded real functional on M, then u has an extension U to
a bounded real functional on $N = M \oplus \mathrm{Span}\{x_0\}$ such that $\|U\| = \|u\|$.


Proof. Without loss of generality, assume that $\|u\| = 1$. Every element of N can
be written uniquely as $x + \alpha x_0$, where $x \in M$ and $\alpha \in \mathbb{R}$. Define $U : N \to \mathbb{R}$
by $U(x + \alpha x_0) = u(x) + \alpha b$, where b is a constant to be determined later in
the proof. The linearity of U is obvious, and since U extends u, $\|u\| \le \|U\|$. It
remains to show that $\|U\| \le 1$. It suffices to show that a constant b exists such that
$|u(x) - b| \le \|x - x_0\|$ for every $x \in M$, because then, for $\alpha \ne 0$, $|U(x + \alpha x_0)| = |u(x) + \alpha b| =
|\alpha|\,|u(-x/\alpha) - b| \le |\alpha|\,\|-x/\alpha - x_0\| = \|x + \alpha x_0\|$; hence $\|U\| \le 1$. We now show
that a constant b exists such that $|u(x) - b| \le \|x - x_0\|$ for every $x \in M$. For
$x, y \in M$, $u(x) - u(y) = u(x - y) \le \|u\|\|x - y\| = \|x - y\| \le \|x - x_0\| + \|y - x_0\|$.
Therefore $u(x) - \|x - x_0\| \le u(y) + \|y - x_0\|$, and $b_1 = \sup_{x \in M}\{u(x) - \|x - x_0\|\}
\le \inf_{y \in M}\{u(y) + \|y - x_0\|\} = b_2$. Any constant b such that $b_1 \le b \le b_2$
satisfies $u(x) - \|x - x_0\| \le b \le u(x) + \|x - x_0\|$ for every $x \in M$. For such a
constant b, $|u(x) - b| \le \|x - x_0\|$. The existence of b concludes the proof. ∎
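The interval $[b_1, b_2]$ of admissible constants in the proof above can be computed in a toy case. The following sketch rests entirely on assumptions that are not in the text: $X = \mathbb{R}^2$, $M = \{(t, 0)\}$, $u(t, 0) = t$ (so $\|u\| = 1$ in both norms used), and $x_0 = (0, 1)$. The supremum and infimum are taken over an integer grid, which happens to attain the true values in this example but is not a substitute for the real sup and inf in general.

```python
# Illustration of the constants b1 and b2 from the proof of lemma 6.4.2.
# Assumed setting (hypothetical): X = R^2, M = {(t, 0)}, u(t, 0) = t, x0 = (0, 1).

def norm_1(v):
    return abs(v[0]) + abs(v[1])

def norm_inf(v):
    return max(abs(v[0]), abs(v[1]))

def admissible_interval(norm, grid):
    """Return (b1, b2) computed over the sample points (t, 0), t in grid."""
    x0 = (0, 1)
    dist = lambda t: norm((t - x0[0], 0 - x0[1]))   # ||(t, 0) - x0||
    b1 = max(t - dist(t) for t in grid)   # sup of u(x) - ||x - x0||
    b2 = min(t + dist(t) for t in grid)   # inf of u(y) + ||y - x0||
    return b1, b2

grid = range(-50, 51)
interval_1 = admissible_interval(norm_1, grid)      # R^2 with the 1-norm
interval_inf = admissible_interval(norm_inf, grid)  # R^2 with the max norm

print(interval_1, interval_inf)
```

With the 1-norm any $b$ in $[-1, 1]$ gives a norm-preserving extension, while with the max norm the only admissible value is $b = 0$. Norm-preserving extensions are therefore not unique in general.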


Lemma 6.4.3. Let M be a subspace of a real normed linear space X, and let u be a
bounded real functional on M. Then u has a bounded real extension, U, on X such
that $\|U\| = \|u\|$.


Proof. Consider the family $\mathfrak{B} = \{(M_\alpha, U_\alpha) : \alpha \in I\}$ of extensions of u that satisfy the
conclusion of the lemma. Thus, for each $\alpha \in I$, M is a subspace of $M_\alpha$, $U_\alpha$ extends
u, and $\|U_\alpha\| = \|u\|$. Order $\mathfrak{B}$ by set and function inclusion: $(M_\alpha, U_\alpha) \le (M_\beta, U_\beta)$
means, by definition, that $M_\alpha \subseteq M_\beta$, and $U_\beta$ extends $U_\alpha$. If $\mathfrak{C} = \{(M_\alpha, U_\alpha) :
\alpha \in J\}$ is a chain in $\mathfrak{B}$, let $(N, U) = (\bigcup_{\alpha \in J} M_\alpha, \bigcup_{\alpha \in J} U_\alpha)$. It is easy to verify that N
is a subspace of X, that U is well defined and linear, and that $\|U\| = \|u\|$. All the
properties follow from the fact that $\mathfrak{C}$ is a chain. Thus (N, U) is an upper bound of
$\mathfrak{C}$. By Zorn’s lemma, $\mathfrak{B}$ has a maximal member, $(M^*, U^*)$. It must be the case that
$M^* = X$ because otherwise we can pick an element $x_0 \in X - M^*$ and use lemma
6.4.2 to extend $U^*$ to $M^* \oplus \mathrm{Span}\{x_0\}$, which would contradict the maximality of
$(M^*, U^*)$.


Theorem 6.4.4 (the Hahn-Banach theorem). Let $\lambda$ be a bounded linear functional
on a subspace M of a complex normed linear space X. Then $\lambda$ has an extension to
a bounded linear functional, $\Lambda$, on X such that $\|\Lambda\| = \|\lambda\|$.


Proof. Consider X as a real normed linear space simply by limiting the scalar field to
$\mathbb{R}$, and let u be the real part of $\lambda$. By lemma 6.4.1, u is a bounded real functional
on M, and $\|u\| = \|\lambda\|$. By lemma 6.4.3, u has an extension, U, to X such that
$\|U\| = \|u\|$. Define $\Lambda : X \to \mathbb{C}$ by $\Lambda(x) = U(x) - iU(ix)$. By lemma 6.4.1, $\Lambda$ is a
bounded linear functional on X, and $\|\Lambda\| = \|U\| = \|u\| = \|\lambda\|$.


We now look at some applications of the Hahn-Banach theorem. The results below 
are important in their own right. 


Theorem 6.4.5. Let M be a subspace of a normed linear space X, and let $x_0 \in X$.
Then $x_0 \in \overline{M}$ if and only if there does not exist a bounded linear functional $\lambda$ on
X such that $\lambda(M) = 0$ and $\lambda(x_0) \ne 0$.


Proof. We show that if $x_0 \notin \overline{M}$, then there exists a functional $\Lambda \in X^*$ such
that $\Lambda(M) = 0$ and $\Lambda(x_0) = 1$. Since $x_0 \notin \overline{M}$, there exists a number $\delta > 0$
such that $\|x - x_0\| \ge \delta$ for every $x \in M$. Let $N = M \oplus \mathrm{Span}\{x_0\}$, and define a
function $\lambda : N \to \mathbb{C}$ by $\lambda(x + \alpha x_0) = \alpha$; $\lambda$ is clearly linear on N, $\lambda(M) = 0$, and
$\lambda(x_0) = 1$. Now, for any $x \in M$ and $\alpha \ne 0$, $\left\|\frac{x}{-\alpha} - x_0\right\| \ge \delta$. Thus
$\delta|\lambda(x + \alpha x_0)| = \delta|\alpha| \le |\alpha| \left\|\frac{x}{-\alpha} - x_0\right\| = \|x + \alpha x_0\|$. Thus $\lambda$ is bounded on N ($\|\lambda\| \le 1/\delta$).
Extend $\lambda$ to a bounded linear functional $\Lambda$ on X. The functional $\Lambda$ has the
desired properties. Conversely, if $x_0 \in \overline{M}$ and $\lambda \in X^*$ is such that $\lambda(M) = 0$,
then there exists a sequence of vectors $(x_n)$ in M such that $\lim_n x_n = x_0$. Now
$\lambda(x_0) = \lambda(\lim_n x_n) = \lim_n \lambda(x_n) = 0$.


The following result can be used to prove certain approximation theorems. It 
follows immediately from the previous theorem. 


Corollary 6.4.6. Let M be a subspace of a normed linear space X. If, for
$\lambda \in X^*$, $\lambda(M) = 0$ implies that $\lambda = 0$, then M is dense in X. ∎


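Example 3. Let A be a dense subset of $[-\pi, \pi]$. For a fixed $t \in A$, the sequence
$\xi_t = \left(\frac{e^{int}}{n}\right)_{n=1}^\infty$ is in $\ell^2$. We claim that the subspace $M = \mathrm{Span}\{\xi_t : t \in A\}$ is
dense in $\ell^2$. We use the above corollary and show that, for a bounded
linear functional $\lambda$ on $\ell^2$, $\lambda(M) = 0$ is possible only if $\lambda = 0$. By theorem
6.2.4, there exists a sequence $(y_n) \in \ell^2$ such that, for every sequence
$x = (x_n) \in \ell^2$, $\lambda(x) = \sum_{n=1}^\infty x_n y_n$. For every $t \in [-\pi, \pi]$,
$\sum_{n=1}^\infty \left|\frac{y_n e^{int}}{n}\right| = \sum_{n=1}^\infty \frac{|y_n|}{n} \le \|y\|_2 \left(\sum_{n=1}^\infty \frac{1}{n^2}\right)^{1/2} < \infty$,
and the series $\sum_{n=1}^\infty \frac{y_n}{n} e^{int}$ converges absolutely and
uniformly on $[-\pi, \pi]$ to a continuous function F(t). By assumption, F vanishes
on a dense subset of $[-\pi, \pi]$, so F is identically equal to the zero function.
Theorem 4.10.5 implies that $\frac{y_n}{n} = 0$ for all $n \in \mathbb{N}$. Thus $y_n = 0$, and $\lambda = 0$. ♦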


Another corollary of the Hahn-Banach theorem is the following separation 
theorem. 


Corollary 6.4.7. Let X be a normed linear space, and let $x_0 \in X$, $x_0 \ne 0$. Then there
exists a bounded linear functional $\lambda$ on X such that $\lambda(x_0) = \|x_0\|$, and $\|\lambda\| = 1$.
In particular, if $y \in X$ and $\lambda(y) = 0$ for all $\lambda \in X^*$, then $y = 0$.


Proof. Let $M = \mathrm{Span}\{x_0\}$, and define a functional $\lambda : M \to \mathbb{C}$ by $\lambda(\alpha x_0) = \alpha\|x_0\|$.
Clearly, $\lambda(x_0) = \|x_0\|$, and $\|\lambda\| = 1$. By the Hahn-Banach theorem, $\lambda$ has a norm-preserving
extension to X. ∎


The following important construction relies heavily on the above corollary. As we
established in section 6.2, the dual $X^*$ of a normed linear space X is a Banach space
with the norm $\|\lambda\| = \sup_{x \ne 0} \frac{|\lambda(x)|}{\|x\|}$; $X^*$, in turn, has a dual, $X^{**}$, which is a Banach
space known as the second dual of X. Now X can be linearly and isometrically
embedded into $X^{**}$ as follows. For an element $x \in X$, define an element $\hat{x}$ of $X^{**}$ by
$\hat{x}(\lambda) = \lambda(x)$. The linearity of $\hat{x}$, as well as that of the mapping $X \to X^{**}$ defined
by $x \mapsto \hat{x}$, is obvious. We now show that $\|\hat{x}\| = \|x\|$. Since $|\hat{x}(\lambda)| = |\lambda(x)| \le
\|\lambda\|\|x\|$, we have $\frac{|\hat{x}(\lambda)|}{\|\lambda\|} \le \|x\|$ for $\lambda \ne 0$. Hence $\|\hat{x}\| \le \|x\|$. We now show that $\|\hat{x}\| \ge \|x\|$. By corollary
6.4.7, there exists $\lambda \in X^*$ such that $\|\lambda\| = 1$, and $\lambda(x) = \|x\|$. Now $|\hat{x}(\lambda)| = |\lambda(x)| =
\|x\|$. Therefore $\|\hat{x}\| \ge \|x\|$ and $\|\hat{x}\| = \|x\|$. We have proved the following result.


Theorem 6.4.8. Let X be a normed linear space, and let $\varphi : X \to X^{**}$ be the function
$\varphi(x) = \hat{x}$. Then $\varphi$ is a linear isometry. ∎


The function $\varphi$ in the above theorem is known as the natural embedding of X into
$X^{**}$. We use the notation $\hat{X}$ to denote the range of $\varphi$. Thus $\hat{X} = \{\hat{x} : x \in X\}$.

The above theorem provides the neatest construction of the completion of a 
normed linear space. 


Theorem 6.4.9. Let X be a normed linear space. Then X can be linearly and 
isometrically embedded as a dense subspace of a Banach space. Thus every normed 
linear space has a completion. 


Proof. We know that $X^{**}$ is a Banach space. Let $\hat{X}$ be the image of X under the
natural embedding $\varphi$ in theorem 6.4.8. The desired completion of X is the closure
of $\hat{X}$ in $X^{**}$.


Definition. A Banach space X is reflexive if $\hat{X} = X^{**}$. Thus X is reflexive if every
member of $X^{**}$ is of the form $\hat{x}$ for some $x \in X$.


Example 4. The $\ell^p$ spaces are reflexive for $1 < p < \infty$. This follows directly from
theorem 6.2.4. ♦


The result below is important in its own right, but it also helps us decide whether 
certain spaces are reflexive. 


Example 5. Let X be a normed linear space. If X* is separable, then X is separable. 


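Let $\{\lambda_n\}$ be a countable dense subset of $X^*$. Since $\|\lambda_n\| = \sup_{\|x\|=1} |\lambda_n(x)|$, there
exist unit vectors $x_n \in X$ such that $|\lambda_n(x_n)| > \|\lambda_n\|/2$. Let $M = \mathrm{Span}\{x_1, x_2, \dots\}$.
We employ corollary 6.4.6. Suppose that $\lambda \in X^*$ is such that $\lambda(M) = 0$. Let
$\epsilon > 0$, and pick a positive integer n such that $\|\lambda_n - \lambda\| < \epsilon$. By the definition of
$x_n$ and the fact that $\lambda(x_n) = 0$, we have $\|\lambda_n\|/2 < |\lambda_n(x_n)| = |\lambda_n(x_n) - \lambda(x_n)| =
|(\lambda_n - \lambda)(x_n)| \le \|\lambda_n - \lambda\| < \epsilon$. Therefore $\|\lambda\| \le \|\lambda - \lambda_n\| + \|\lambda_n\| < \epsilon + 2\epsilon = 3\epsilon$.
This means that $\lambda = 0$, and, by corollary 6.4.6, M is dense in X. Now the
countable set $\{\sum_{i=1}^n a_i x_i : n \in \mathbb{N}, a_i \in \mathbb{Q} + i\mathbb{Q}\}$ is dense in X. ♦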
Definition. A closed subspace M of a Banach space X is said to be complemented
if there exists a closed subspace N of X such that $X = M \oplus N$.


The definition of the algebraic complement of an arbitrary subspace of a vector 
space was introduced in section 3.4. The current definition requires both M and 
N to be closed subspaces of X. The direct sum of two closed subspaces of a Banach 
space is sometimes referred to as the topological direct sum of the two subspaces. 


The very definition suggests that not every closed subspace of a Banach space has 
a closed complement. However, the following examples identify two important 
special cases where closed complements are guaranteed. 


Example 6. If M is a finite-dimensional subspace of a Banach space X, then M is 
complemented. 


Let $\{x_1, \dots, x_n\}$ be a basis for M. For $x \in M$, $x = \sum_{i=1}^n a_i(x) x_i$ for a unique set of
coefficients $a_1(x), \dots, a_n(x)$. Each $a_i$ is a continuous linear functional on M. By
the Hahn-Banach theorem, each $a_i$ has an extension to a functional $\lambda_i \in X^*$.
Define an operator $P : X \to X$ by $P(x) = \sum_{i=1}^n \lambda_i(x) x_i$. It is easy to see that
$P \in \mathcal{L}(X)$, that $P(x) = x$ for every $x \in M$, and that $P^2 = P$. Let $N = \mathrm{Ker}(P) =
\bigcap_{i=1}^n \mathrm{Ker}(\lambda_i)$. Clearly, N is a closed subspace of X. For $x \in X$, $x = x - P(x) + P(x)$.
By the above, $P(x) \in M$, and $x - P(x) \in N$, since $P(x - P(x)) = P(x) - P^2(x) =
P(x) - P(x) = 0$. This shows that $M + N = X$. If $x \in M \cap N$, then $\lambda_i(x) = 0$ for
every $1 \le i \le n$; hence $x = \sum_{i=1}^n a_i(x) x_i = \sum_{i=1}^n \lambda_i(x) x_i = 0$. We have shown that
$M \oplus N = X$. ♦
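The projection built in example 6 can be tried out in a small finite-dimensional case. The sketch below assumes $X = \mathbb{R}^3$ and $M = \mathrm{Span}\{x_1, x_2\}$ with $x_1 = (1, 0, 0)$, $x_2 = (0, 1, 0)$; the functionals `lam1` and `lam2` are one hypothetical choice of extensions of the coordinate functionals on M (many other extensions exist). Integer arithmetic keeps every check exact.

```python
# Sketch of the projection P(x) = lam1(x) x1 + lam2(x) x2 from example 6,
# in the assumed setting X = R^3, M = Span{x1, x2}.

x1, x2 = (1, 0, 0), (0, 1, 0)

def lam1(v):
    """A hypothetical extension of the first coordinate functional on M."""
    return v[0] + 2 * v[2]

def lam2(v):
    """A hypothetical extension of the second coordinate functional on M."""
    return v[1] - v[2]

def P(v):
    c1, c2 = lam1(v), lam2(v)
    return tuple(c1 * a + c2 * b for a, b in zip(x1, x2))

v = (3, -1, 4)
Pv = P(v)                                        # component of v in M
residual = tuple(a - b for a, b in zip(v, Pv))   # component of v in N = Ker(P)

print(Pv, residual, P(residual))
```

Here `v` splits as `Pv + residual` with `Pv` in M and `residual` in the kernel of P, which mirrors the decomposition $x = P(x) + (x - P(x))$ used in the example.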


Example 7. If N is a closed, finite co-dimensional subspace of X, then N is 
complemented. 


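Recall that the co-dimension of N is the dimension of the quotient space
X/N. Pick vectors $x_1, \dots, x_n$ such that $\{\tilde{x}_1, \dots, \tilde{x}_n\}$ is a basis for X/N, where
$\tilde{x}_i = x_i + N$, and let $M = \mathrm{Span}\{x_1, \dots, x_n\}$. We claim that $M \oplus N = X$. For $x \in X$,
$x + N = \sum_{i=1}^n a_i(x_i + N) = (\sum_{i=1}^n a_i x_i) + N$. Therefore $y = x - \sum_{i=1}^n a_i x_i \in N$,
and $x = \sum_{i=1}^n a_i x_i + y \in M + N$. If $x \in M \cap N$, then $x = \sum_{i=1}^n a_i x_i \in N$. Thus
$\sum_{i=1}^n a_i \tilde{x}_i = 0$; hence $a_i = 0$ for every $1 \le i \le n$ by the independence of
$\{\tilde{x}_1, \dots, \tilde{x}_n\}$, and $x = 0$. ♦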


Exercises 


1. Let M be a closed maximal subspace of a normed linear space X. Prove that
there exists a functional $\lambda \in X^*$ such that $\mathrm{Ker}(\lambda) = M$.


2. Let A be a subset of a normed linear space X. Prove that A is bounded in X
if and only if, for every $\lambda \in X^*$, $\lambda(A)$ is bounded in $\mathbb{C}$.


3. Prove that if $\{x_1, \dots, x_n\}$ is an independent subset of a normed linear space
X, and $\{\alpha_1, \dots, \alpha_n\}$ is an arbitrary set of complex numbers, then there exists
$\lambda \in X^*$ such that $\lambda(x_i) = \alpha_i$ for all $1 \le i \le n$.


4. Let M be a closed subspace of a normed linear space X, and let $x_0 \notin M$. Prove
there exists $\lambda \in X^*$ such that $\lambda(M) = 0$, $\|\lambda\| \le 1$, and $|\lambda(x_0)| = \mathrm{dist}(x_0, M)$.


5. Prove that, for an element x of a normed linear space X, $\|x\| = \sup\{|\lambda(x)| :
\lambda \in X^*, \|\lambda\| = 1\}$.


Definition. A sequence $(x_n)$ in a normed linear space X is said to converge
weakly to an element $x \in X$ if, for every $\lambda \in X^*$, $\lim_n \lambda(x_n) = \lambda(x)$. We use
the notation $x_n \to^w x$ to indicate the weak convergence of $(x_n)$ to x.


6. Prove that if $(x_n)$ is weakly convergent, then $(\|x_n\|)$ is bounded.
7. Prove that if $x_n \to^w x$ and $y_n \to^w y$, then for any scalars a and b,
$ax_n + by_n \to^w ax + by$.


8. Prove that the weak limit of a sequence, if it exists, is unique.
9. Prove that $\ell^1$ is not reflexive.


Definition. A bounded operator P on a Banach space X is called a bounded
projection if $P^2 = P$. Equivalently, if $Px = x$ for every $x \in M = \mathcal{R}(P)$. See
problems 13 and 14 in section 3.4 for the general properties of the projection
of a vector space onto a subspace.


10. Let M be a closed subspace of a Banach space X. Prove that M is complemented
if and only if there exists a bounded projection P on X such that
$M = \mathcal{R}(P)$. Hint: Suppose $X = M \oplus N$, where M and N are closed subspaces
of X, and let $P : X \to X$ be the projection of X onto M. Use the closed graph
theorem to prove the boundedness of P.

11. Suppose that M and N are closed, complementary subspaces of a Banach
space X, and let $T_1 : M \to X$ and $T_2 : N \to X$ be bounded linear mappings.
Define $T : X \to X$ by $T(x) = T_1(y) + T_2(z)$, where $x = y + z$, $y \in M$, $z \in N$.
Prove that T is bounded.


6.5 The Spectrum of an Operator 


The spectrum of a square matrix A is simply its set of eigenvalues, and the
eigenvalues of A are easy to characterize. They are exactly the complex numbers $\lambda$
for which the matrix $A - \lambda I$ is not invertible. We recall the simple fact that $A - \lambda I$
is not invertible if and only if the linear operator T it generates on $\mathbb{K}^n$ is not one-to-one,
and this is the case if and only if T is not onto.

The definition of the spectrum of an operator T on an infinite-dimensional
space is exactly the same as it is for a matrix. The stark distinction here is that
not every point in the spectrum of an operator on an infinite-dimensional space
is an eigenvalue. This is because such an operator may be one-to-one but not onto
or conversely. See example 1. Thus the spectrum consists of two main parts: the
complex numbers $\lambda$ for which $T - \lambda I$ is not one-to-one (the eigenvalues) and those
for which $T - \lambda I$ is one-to-one but not onto. The spectrum of an operator T often
carries valuable information about T, and, in some cases, the eigenvalues of an
operator and the corresponding eigenvectors completely define the operator.


Definition. A Banach algebra is a Banach space X that is also an algebra with a
multiplicative identity I such that the norm satisfies the following additional
assumptions:


(a) $\|I\| = 1$, and
(b) $\|ST\| \le \|S\|\|T\|$ for all S and T in X.


We know that the set $\mathcal{L}(X)$ of bounded linear operators on a Banach space X is a
Banach space. In fact, $\mathcal{L}(X)$ is a Banach algebra with the composition of operators
as the multiplication operation. The composition of two operators S and T is
usually denoted by ST rather than $S \circ T$. Property (a) is obvious, and property (b)
follows from the inequalities $\|(ST)(x)\| = \|S(T(x))\| \le \|S\|\|T(x)\| \le \|S\|\|T\|\|x\|$.


For the convenience of the reader, we list below the properties that make £(X) a 
Banach algebra: for operators T,S,U € £(X) and all a,b € K, 


(a) $(ST)U = S(TU)$,
(b) $(ab)T = a(bT)$,
(c) $(T + S)U = TU + SU$ and $U(T + S) = UT + US$,
(d) $\|I\| = 1$, and
(e) $\|ST\| \le \|S\|\|T\|$.


The algebra £(X) is called the operator algebra on X. 


Definition. An operator T € £(X) is called invertible if there exists an operator 
S € £(X) such that ST = TS = I. 


If T is a bounded linear bijection of a Banach space X, then its inverse is bounded
by theorem 6.3.7. Thus a bounded operator T fails to be invertible if

(a) T is not one-to-one, that is, $\mathrm{Ker}(T) \ne \{0\}$, or
(b) T is not onto, that is, $\mathcal{R}(T) \ne X$.


We point out here an important distinction between operators on finite vs. 
infinite-dimensional spaces. Every linear operator on a finite-dimensional space 
is bounded, and such an operator is one-to-one if and only if it is onto. This is not 
the case in infinite dimensions, as the following example illustrates. 


Example 1. The right shift operator R and the left shift operator L on $\ell^2$ are,
respectively,


$R(x_1, x_2, \dots) = (0, x_1, x_2, \dots)$,


$L(x_1, x_2, \dots) = (x_2, x_3, \dots)$.


It is clear that R is one-to-one but not onto, while L is onto but not
one-to-one. ♦
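The asymmetry between the two shifts shows up already on finitely supported sequences. In the sketch below, a Python list stands for a sequence whose remaining terms are all zero; this representation and the function names are assumptions made for illustration only.

```python
# Sketch: right and left shifts acting on finitely supported sequences.
# A list [a, b, c] stands for the sequence (a, b, c, 0, 0, ...).

def right_shift(x):
    """R(x1, x2, ...) = (0, x1, x2, ...)."""
    return [0] + list(x)

def left_shift(x):
    """L(x1, x2, ...) = (x2, x3, ...)."""
    return list(x[1:])

x = [3, 1, 4, 1, 5]
lr = left_shift(right_shift(x))   # L(R(x)) recovers x
rl = right_shift(left_shift(x))   # R(L(x)) loses the first coordinate
kernel_image = left_shift([1, 0, 0])  # L(e_1) is the zero sequence

print(lr, rl, kernel_image)
```

So $LR = I$ while $RL \ne I$: L is a left inverse of R, yet neither operator is invertible, in line with R failing to be onto and L failing to be one-to-one.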


Definition. The spectrum, $\sigma(T)$, of an operator $T \in \mathcal{L}(X)$ is the set of all complex
numbers $\lambda$ such that $T - \lambda I$ is not invertible. It follows that there are two types
of points in the spectrum:


(a) Complex numbers $\lambda$ such that $\mathrm{Ker}(T - \lambda I) \ne \{0\}$: Such a number $\lambda$ is called
an eigenvalue of T. Specifically, $\lambda$ is an eigenvalue of T if there exists a nonzero
vector x such that $Tx = \lambda x$. In this case, we say that x is an eigenvector of T
corresponding (or belonging) to the eigenvalue $\lambda$. The set of eigenvalues of T is
known as the point spectrum of T. The set $\mathrm{Ker}(T - \lambda I)$ is called the eigenspace
of T corresponding to the eigenvalue $\lambda$.


(b) Complex numbers $\lambda$ such that $T - \lambda I$ is one-to-one but not onto, that
is, $\mathcal{R}(T - \lambda I) \ne X$. We will not dwell on this part of the spectrum, since the
eigenvalues are the only important part of the spectrum for our purposes.


The complement of the spectrum of T in the complex plane is called the resolvent
set of T and is denoted $\rho(T)$. Thus $\lambda \in \rho(T)$ if and only if $(T - \lambda I)^{-1}$ exists. If
$\lambda \in \rho(T)$, we use the notation $T_\lambda$ to denote $(T - \lambda I)^{-1}$.


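Example 2. Define an operator T on $C[0, 1]$ as follows: for $f \in C[0, 1]$, $(Tf)(x) =
xf(x)$. The reader can easily verify that T has no eigenvalues. Thus the spectrum
consists only of complex numbers $\lambda$ for which $T - \lambda I$ is not onto. For $\lambda \in \mathbb{C}$
and $g \in C[0, 1]$, if there exists a function $f \in C[0, 1]$ such that $(T - \lambda I)f = g$, then
$f(x) = \frac{g(x)}{x - \lambda}$. If $\lambda \notin [0, 1]$, this formula defines a continuous function for every g,
while if $\lambda \in [0, 1]$, it fails to do so for the constant function $g = 1$. Therefore the
spectrum is the interval $[0, 1]$. ♦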




Example 3. Every complex number $\lambda$ in the open unit disk is an eigenvalue of the
left shift operator on $\ell^2$.


If $0 \ne \lambda \in \mathbb{C}$ and $|\lambda| < 1$, then the vector $x_\lambda = (\lambda, \lambda^2, \lambda^3, \dots)$ is clearly in $\ell^2$
and $L(x_\lambda) = \lambda x_\lambda$. Also, $\lambda = 0$ is an eigenvalue of L because $L(e_1) = 0$, where
$e_1 = (1, 0, 0, \dots)$. ♦
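A truncated version of the eigenvector in example 3 can be checked numerically. The sketch below assumes the particular value $\lambda = 0.5 + 0.5i$ and a cutoff of 60 terms, both chosen only for illustration. It compares $L(x_\lambda)$ with $\lambda x_\lambda$ on the retained coordinates and sums $|\lambda^k|^2$ to confirm that the $\ell^2$ norm stays finite.

```python
# Sketch: the first terms of x_lambda = (lam, lam**2, lam**3, ...) and the
# eigenvalue relation L(x_lambda) = lam * x_lambda for the left shift L.

def eigen_sequence(lam, length):
    """First `length` terms of (lam, lam**2, lam**3, ...)."""
    return [lam ** k for k in range(1, length + 1)]

def left_shift(x):
    return x[1:]

lam = 0.5 + 0.5j              # assumed sample value with |lam|**2 = 0.5 < 1
x = eigen_sequence(lam, 60)

shifted = left_shift(x)                  # coordinates 1..59 of L(x_lambda)
scaled = [lam * c for c in x[:-1]]       # coordinates 1..59 of lam * x_lambda
max_gap = max(abs(a - b) for a, b in zip(shifted, scaled))

# Partial sum of the squared l^2 norm; the full series sums to
# |lam|^2 / (1 - |lam|^2), which equals 1 for this choice of lam.
norm_sq = sum(abs(c) ** 2 for c in x)

print(max_gap, norm_sq)
```

The gap is at the level of floating-point rounding, and the partial sums of the squared norm approach 1.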


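Lemma 6.5.1. If $T \in \mathcal{L}(X)$, and $\|T\| < 1$, then $(I - T)^{-1}$ exists,
$(I - T)^{-1} = \sum_{n=0}^\infty T^n$, and $\|(I - T)^{-1}\| \le \frac{1}{1 - \|T\|}$.


Proof. First observe that $\sum_{n=0}^\infty \|T^n\| \le \sum_{n=0}^\infty \|T\|^n = \frac{1}{1 - \|T\|}$. Thus the series $\sum_{n=0}^\infty T^n$
converges to an operator $S \in \mathcal{L}(X)$, and $\|S\| \le \frac{1}{1 - \|T\|}$. Now $(I - T)\sum_{i=0}^n T^i = I - T^{n+1}$. Taking the
limit as $n \to \infty$, $(I - T)S = I$. Similarly, $S(I - T) = I$; hence $(I - T)^{-1} = S$.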
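The Neumann series of lemma 6.5.1 can be summed explicitly for a small matrix. The sketch below is an illustration under assumptions not found in the text: T is a sample $2 \times 2$ real matrix, and the norm is the maximum absolute row sum (the operator norm induced by the supremum norm on $\mathbb{R}^2$), for which $\|T\| = 0.5$.

```python
# Sketch: partial sums of the Neumann series I + T + T^2 + ... for a 2x2 matrix.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def row_sum_norm(a):
    """Maximum absolute row sum (operator norm for the sup norm on R^n)."""
    return max(sum(abs(c) for c in row) for row in a)

def neumann_sum(t, terms):
    """Partial sum I + T + ... + T**(terms - 1)."""
    n = len(t)
    total, power = identity(n), identity(n)
    for _ in range(terms - 1):
        power = mat_mul(power, t)
        total = mat_add(total, power)
    return total

T = [[0.2, 0.3], [0.1, 0.2]]     # assumed sample matrix with norm 0.5
S = neumann_sum(T, 80)

# (I - T) S should be close to I, and ||S|| should not exceed 1 / (1 - ||T||).
residual = mat_sub(mat_mul(mat_sub(identity(2), T), S), identity(2))
bound = 1.0 / (1.0 - row_sum_norm(T))

print(row_sum_norm(residual), row_sum_norm(S), bound)
```

The residual is negligible after 80 terms because $(I - T)\sum_{i=0}^{n} T^i = I - T^{n+1}$ and $\|T^{n+1}\| \le 0.5^{n+1}$.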


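Theorem 6.5.2. Let $T \in \mathcal{L}(X)$. If $\lambda \in \mathbb{C}$ and $|\lambda| > \|T\|$, then $\lambda \in \rho(T)$.

Proof. Since $\|T\| < |\lambda|$, $\|T/\lambda\| < 1$. By lemma 6.5.1, $(I - T/\lambda)^{-1}$ exists. Thus $T - \lambda I$
is invertible since $(T - \lambda I)^{-1} = -\lambda^{-1}(I - T/\lambda)^{-1}$. Notice that, in this case,

$T_\lambda = (T - \lambda I)^{-1} = -\frac{1}{\lambda} \sum_{n=0}^\infty \frac{T^n}{\lambda^n}$. ∎

Corollary 6.5.3. The spectrum $\sigma(T)$ of an operator $T \in \mathcal{L}(X)$ is bounded.


Proof. By theorem 6.5.2, $\sigma(T)$ is contained in the closed disk $\{z \in \mathbb{C} :
|z| \le \|T\|\}$. ∎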


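Theorem 6.5.4. The spectrum $\sigma(T)$ of an operator $T \in \mathcal{L}(X)$ is a closed, hence
compact, subset of $\mathbb{C}$.


Proof. We show that $\rho(T)$ is an open subset of $\mathbb{C}$. Let $\lambda_0 \in \rho(T)$, and let $\lambda \in \mathbb{C}$. Recall
the notation $T_\lambda = (T - \lambda I)^{-1}$. Now

$T - \lambda I = (T - \lambda_0 I) - (\lambda - \lambda_0)I = (T - \lambda_0 I)[I - (\lambda - \lambda_0)T_{\lambda_0}]$.

If $|\lambda - \lambda_0| < 1/\|T_{\lambda_0}\|$ then, by lemma 6.5.1, $I - (\lambda - \lambda_0)T_{\lambda_0}$ is invertible, and
hence $T - \lambda I$ is also invertible, being the composition of invertible operators. This
shows that $\rho(T)$ contains the disk in the complex plane centered at $\lambda_0$ of radius
$1/\|T_{\lambda_0}\|$, and therefore $\rho(T)$ is open in $\mathbb{C}$.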


Definition. The spectral radius of an operator T is the number


$r(T) = \sup\{|\lambda| : \lambda \in \sigma(T)\}$.

Thus r(T) is the radius of the smallest closed disk in the complex plane centered at the
origin that contains $\sigma(T)$. By corollary 6.5.3, $r(T) \le \|T\|$. It is possible that $r(T) < \|T\|$. See
problem 5 in section 7.4.


Example 4. Let $L : \ell^2 \to \ell^2$ be the left shift operator. It is clear that $\|L(x)\|_2 \le \|x\|_2$,
and since $\|L(e_2)\|_2 = \|e_1\|_2 = 1 = \|e_2\|_2$, $\|L\| = 1$. Therefore the spectrum of
L is contained in the closed unit disk, D. Since $\sigma(L)$ is closed (theorem 6.5.4) and,
by example 3, contains the open unit disk, $\sigma(L) = D$. Thus, $\|L\| = r(L) = 1$. ♦


The last conclusion of the previous example is true for the right shift operator. We 
derive it without directly computing o(R). 


Example 5. For the right shift operator R, $\|R\| = r(R) = 1$.


Since R is an isometry, $\|R\| = 1$. The result follows if we prove that $\lambda = 1 \in \sigma(R)$.
We show below that $R - I$ is not onto.
We formally compute the inverse image, $x = (x_n)$, under $R - I$ of an element
$y = (y_n) \in \ell^2$. If $(R - I)(x) = y$, then $(-x_1, x_1 - x_2, x_2 - x_3, \dots) = (y_1, y_2, \dots)$.
Equating the corresponding terms, we have $-x_1 = y_1$, $x_1 - x_2 = y_2$, ...,
$x_n - x_{n+1} = y_{n+1}$, ....
Solving for x, we have $x_1 = -y_1$, $x_2 = -y_1 - y_2$, ..., $x_n = -y_1 - y_2 - \dots - y_n$.
Now if $y = (1/n) \in \ell^2$, then there is no $x \in \ell^2$ such that $(R - I)(x) = y$ since
$x_n = -\sum_{i=1}^n \frac{1}{i} \to -\infty$. ♦

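Before we show that the spectrum of a bounded linear operator on a Banach space
is not empty, we need to establish the following identity: for $\lambda$ and $\mu \in \rho(T)$,


$T_\lambda - T_\mu = (\lambda - \mu)T_\lambda T_\mu$. (1)


Indeed,

$T_\lambda = (T - \lambda I)^{-1} = T_\lambda(T - \mu I)T_\mu = T_\lambda[T - \lambda I + (\lambda - \mu)I]T_\mu
= [I + (\lambda - \mu)T_\lambda]T_\mu = T_\mu + (\lambda - \mu)T_\lambda T_\mu$.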
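Identity (1) can be verified exactly for a matrix. The sketch below assumes a sample $2 \times 2$ upper triangular matrix with eigenvalues 2 and 3, and two sample points $\lambda = 5$, $\mu = -1$ of its resolvent set; the resolvent is computed with the adjugate formula for $2 \times 2$ inverses, and `Fraction` keeps all arithmetic exact.

```python
# Sketch: exact check of T_lam - T_mu = (lam - mu) T_lam T_mu for a 2x2 matrix.
from fractions import Fraction

def resolvent(t, lam):
    """Return (T - lam*I)^(-1) for a 2x2 matrix T, via the adjugate formula."""
    a, b = t[0][0] - lam, t[0][1]
    c, d = t[1][0], t[1][1] - lam
    det = a * d - b * c
    if det == 0:
        raise ValueError("lam is an eigenvalue of T")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

T = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]  # assumed sample
lam, mu = Fraction(5), Fraction(-1)                           # not eigenvalues

T_lam, T_mu = resolvent(T, lam), resolvent(T, mu)

lhs = [[T_lam[i][j] - T_mu[i][j] for j in range(2)] for i in range(2)]
product = mat_mul(T_lam, T_mu)
rhs = [[(lam - mu) * product[i][j] for j in range(2)] for i in range(2)]

print(lhs == rhs)
```

Swapping the roles of $\lambda$ and $\mu$ in (1) also shows that $T_\lambda$ and $T_\mu$ commute, which the same code can confirm by comparing `mat_mul(T_lam, T_mu)` with `mat_mul(T_mu, T_lam)`.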


We need the following result from complex analysis, which we state without proof. 


Lemma 6.5.5 (Liouville’s theorem). If F(z) is a bounded differentiable complex 
function defined on the entire complex plane, then F is constant. 


Theorem 6.5.6. The spectrum of a bounded linear operator T on a Banach space X
is nonempty.

Proof. Suppose, contrary to the above statement, that $\sigma(T) = \emptyset$. Thus $\rho(T) = \mathbb{C}$.
For an arbitrary but fixed functional $g \in (\mathcal{L}(X))^*$, define a function $F : \mathbb{C} \to \mathbb{C}$
by $F(\lambda) = g(T_\lambda)$. By identity (1), $\frac{F(\lambda) - F(\mu)}{\lambda - \mu} = g(T_\lambda T_\mu)$.


As $\mu \to \lambda$, $g(T_\lambda T_\mu) \to g(T_\lambda^2)$. Therefore $F'(\lambda) = \lim_{\mu \to \lambda} \frac{F(\lambda) - F(\mu)}{\lambda - \mu} = g(T_\lambda^2)$,
and F is differentiable at every point of the complex plane. If $|\lambda| > 1 + \|T\|$, then,
by lemma 6.5.1,


$\|T_\lambda\| = \frac{1}{|\lambda|}\left\|\left(\frac{T}{\lambda} - I\right)^{-1}\right\| \le \frac{1}{|\lambda|(1 - \|T\|/|\lambda|)} = \frac{1}{|\lambda| - \|T\|} < 1$. (2)


Thus $\|T_\lambda\|$ is bounded by 1 outside the closed disk, D, of radius $1 + \|T\|$. Therefore,
outside the disk D, $|F(\lambda)| = |g(T_\lambda)| \le \|g\|$. Because F is continuous on D, it is
bounded on D; hence F is a bounded differentiable function on the entire complex
plane. By lemma 6.5.5, $F(\lambda)$ is constant. If $\epsilon > 0$, there exists a positive constant R
such that $\|T_\lambda\| < \epsilon$ for $|\lambda| > R$ (see inequality (2) above). Consequently, for such
$\lambda$, $|F(\lambda)| \le \|g\|\epsilon$. Since $\epsilon$ is arbitrary, and F is constant, $F(\lambda) = 0$ for all $\lambda \in \mathbb{C}$.
Now since g is an arbitrary element of $(\mathcal{L}(X))^*$, $T_\lambda = 0$ (see corollary 6.4.7). This
is impossible because $T_\lambda$ is invertible. ∎


The following formula for the spectral radius is well known. 


Theorem 6.5.7 (Gelfand’s theorem). Let T be a bounded operator on a Banach 
space X. Then r(T) = lim,, ||T"||!/". 


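Proof. By problem 9 at the end of this section, $r(T^n) = [r(T)]^n$. Therefore $r(T) = [r(T^n)]^{1/n} \le \|T^n\|^{1/n}$, and $r(T) \le \liminf_n \|T^n\|^{1/n}$. The proof will be complete if
we show that $\limsup_n \|T^n\|^{1/n} \le r(T)$.

Let $\lambda \in \mathbb{C}$ be such that $|\lambda| > \|T\|$. By theorem 6.5.2,

\[ T_\lambda = (T - \lambda I)^{-1} = -\frac{1}{\lambda} \sum_{n=0}^{\infty} \frac{T^n}{\lambda^n}. \]

If $g \in (\mathcal{L}(X))^*$, then $g(T_\lambda) = -\frac{1}{\lambda} \sum_{n=0}^{\infty} \frac{g(T^n)}{\lambda^n}$. By the proof of theorem
6.5.6, the function $F(\lambda) = g(T_\lambda)$ is differentiable for all $\lambda \in \rho(T)$; thus the function
$F(\lambda)$ extends the series $-\frac{1}{\lambda} \sum_{n=0}^{\infty} \frac{g(T^n)}{\lambda^n}$ to the set $\{\lambda \in \mathbb{C} : |\lambda| > r(T)\}$. Therefore
the series expansion $F(\lambda) = -\frac{1}{\lambda} \sum_{n=0}^{\infty} \frac{g(T^n)}{\lambda^n}$ is valid for all complex numbers $\lambda$ such that
$|\lambda| > r(T)$. (The series involved here are Laurent series.) Now, for an arbitrary real number $a > r(T)$, the series $\sum_{n=0}^{\infty} \frac{g(T^n)}{a^n}$
is convergent; hence the sequence $g(T^n/a^n)$ is bounded. Since $g \in (\mathcal{L}(X))^*$ is
arbitrary, $T^n/a^n$ is bounded in $\mathcal{L}(X)$. Let $K > 0$ be such that $\|T^n/a^n\| \le K$. Then
$\|T^n\|^{1/n} \le K^{1/n} a$, and $\limsup_n \|T^n\|^{1/n} \le a$. Since $a$ is an arbitrary number greater
than $r(T)$, $\limsup_n \|T^n\|^{1/n} \le r(T)$. ∎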
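
As a numerical illustration of Gelfand's formula (a sketch that is not part of the original text), the following Python fragment computes $\|T^n\|^{1/n}$ for a $2 \times 2$ matrix. The matrix `T`, the helper names, and the choice of the maximum-row-sum norm (the operator norm induced by the sup norm on $\mathbb{K}^2$) are assumptions made for the example. Since the only eigenvalue of `T` is 0.5, one expects the estimates to drift toward 0.5 even though $\|T\| = 1.5$.

```python
# Hypothetical example: Gelfand's formula r(T) = lim ||T^n||^(1/n) for a matrix.

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def op_norm_inf(a):
    """Operator norm induced by the sup norm: the largest absolute row sum."""
    return max(sum(abs(entry) for entry in row) for row in a)

def gelfand_estimates(t, n_max):
    """Return the list [||T^n||^(1/n) for n = 1, ..., n_max]."""
    estimates = []
    power = t
    for n in range(1, n_max + 1):
        estimates.append(op_norm_inf(power) ** (1.0 / n))
        power = mat_mul(power, t)
    return estimates

# Upper triangular, so the spectrum is {0.5} and r(T) = 0.5, while ||T|| = 1.5.
T = [[0.5, 1.0], [0.0, 0.5]]
estimates = gelfand_estimates(T, 200)
print(estimates[0], estimates[9], estimates[199])
```

Every estimate stays above $r(T)$, in line with the inequality $r(T) \le \|T^n\|^{1/n}$ from the first part of the proof, and the convergence is slow because $T$ is not diagonalizable.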
Exercises


1. Show that the composition of invertible operators is invertible.

2. Show that if $\lim_n T_n = T$ and $\lim_n S_n = S$ in $\mathcal{L}(X)$, then $\lim_n T_n S_n = TS$.

3. Let $\lambda_1, \dots, \lambda_n$ be distinct eigenvalues of a bounded operator $T$, and let
$u_1, \dots, u_n$ be eigenvectors that correspond to $\lambda_1, \dots, \lambda_n$, respectively. Prove
that $u_1, \dots, u_n$ are independent. Hint: Use induction on $n$.

4. Prove the following version of lemma 6.5.1. If $T \in \mathcal{L}(X)$ is such that
$\|I - T\| < 1$, then $T$ is invertible, $T^{-1} = \sum_{n=0}^{\infty} (I - T)^n$, and $\|T^{-1}\| \le \frac{1}{1 - \|I - T\|}$.

5. Let $T \in \mathcal{L}(X)$ be an invertible operator, and let $S \in \mathcal{L}(X)$. Prove that if
$\|S - T\| < \frac{1}{\|T^{-1}\|}$, then $S$ is invertible.

6. Let $T, S \in \mathcal{L}(X)$ be invertible operators such that $\|S - T\| < \frac{1}{2\|T^{-1}\|}$. Prove
that $\|S^{-1} - T^{-1}\| < 2\|T^{-1}\|^2 \|S - T\|$. Hint: First show that $\|I - T^{-1}S\| < 1/2$,
then use the identity $S^{-1} = [I - (I - T^{-1}S)]^{-1} T^{-1}$ to show that
$S^{-1} - T^{-1} = \sum_{n=1}^{\infty} (I - T^{-1}S)^n T^{-1}$.

7. Let $U$ be the set of all invertible operators in $\mathcal{L}(X)$. Prove that $U$ is open in
$\mathcal{L}(X)$ and that inversion is a homeomorphism on $U$.

8. Let $T$ and $S$ be commuting bounded linear operators on a Banach space
$X$. Prove that if $ST$ is invertible, then $S$ and $T$ are invertible. Also give an
example of two singular operators whose composition is invertible.

9. Prove that, for $T \in \mathcal{L}(X)$, $\sigma(T^n) = \{\mu^n : \mu \in \sigma(T)\}$. Conclude that $r(T^n) = [r(T)]^n$.
Hint: Let $\lambda \in \mathbb{C}$, and let $t^n - \lambda = (t - \mu_1) \cdots (t - \mu_n)$. Then $T^n - \lambda I = (T - \mu_1 I) \cdots (T - \mu_n I)$.

10. For a fixed function $w \in C[0,1]$, define an operator $T$ on $C[0,1]$ by
$(Tf)(x) = f(x)w(x)$. Show that $T$ is a bounded operator and that $\|T\| = \|w\|_\infty$.
Also give a sufficient condition for $T$ to be invertible.


6.6 Adjoint Operators and Quotient Spaces 


In section 3.7, we defined the adjoint of an operator on a finite-dimensional inner 
product space, and, in chapter 7, we will study adjoints of operators on a Hilbert 
space. The definition of the adjoint of an operator on a Banach space X is more complicated. In
fact, the adjoint of a bounded operator on a Banach space X is a bounded operator
on the dual space X*. Among other results, we prove that an operator T and its
adjoint T* have the same norm, the same spectrum, and the same spectral radius.
We also study annihilators and quotient spaces. Little subsequent material rests on 
this section, and it is possible to study the remainder of the book independently 
of this section. 


Notation. The duality bracket: Let $X$ be a Banach space. For $x \in X$ and $\lambda \in X^*$, we
write $\langle x, \lambda \rangle$ for $\lambda(x)$. This is a notational convenience that also facilitates certain
computations. In addition, the notation equalizes the roles of $X$ and $X^*$. We already
saw that $X$ acts on $X^*$ in much the same way $X^*$ acts on $X$. See, for example, the
construction leading up to theorem 6.4.8. Observe that $|\langle x, \lambda \rangle| \le \|x\| \|\lambda\|$, reminiscent
of the Cauchy-Schwarz inequality. We revert to the traditional notation $\lambda(x)$
when convenient.


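Theorem 6.6.1. Let $X$ be a Banach space, let $x \in X$, $\lambda \in X^*$, and let $T \in \mathcal{L}(X)$. Then


(a) $\|\lambda\| = \sup\{|\langle x, \lambda \rangle| : x \in X, \|x\| \le 1\}$,
(b) $\|x\| = \|\hat{x}\| = \sup\{|\langle x, \lambda \rangle| : \lambda \in X^*, \|\lambda\| \le 1\}$, and
(c) $\|T\| = \sup\{|\langle Tx, \lambda \rangle| : x \in X, \lambda \in X^*, \|x\| \le 1, \|\lambda\| \le 1\}$.


Proof. (a) and (b) are previously established facts in new notation. To prove (c),

\[ \|T\| = \sup\{\|Tx\| : \|x\| \le 1\} = \sup_{\|x\| \le 1} \sup_{\|\lambda\| \le 1} |\langle Tx, \lambda \rangle| = \sup\{|\langle Tx, \lambda \rangle| : \|x\| \le 1, \|\lambda\| \le 1\}. \]
∎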


Definition. Let $T \in \mathcal{L}(X)$. We define the adjoint operator $T^*$ on $X^*$ by the
requirement that for all $x \in X$,

\[ \langle Tx, \lambda \rangle = \langle x, T^*(\lambda) \rangle. \]

Using conventional notation rather than duality brackets, the requirement in
the above definition can be written as $\lambda(Tx) = (T^*(\lambda))(x)$ for every $x \in X$. This
simply means that $T^*(\lambda) = \lambda \circ T$, which can well be taken as the definition of the
operator $T^*$. It is obvious that $T^* \in \mathcal{L}(X^*)$.


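Example 1. In this example, we use theorem 6.2.4 and identify $(l^1)^*$ with $l^\infty$. For
elements $x = (x_n) \in l^1$ and $\lambda = (\lambda_n) \in l^\infty$, define $T(x) = (x_2, x_3, \dots)$ and $S(\lambda) = (0, \lambda_1, \lambda_2, \dots)$.
Clearly, $T \in \mathcal{L}(l^1)$ and $S \in \mathcal{L}(l^\infty)$. We claim that $S = T^*$. We need
to verify that $\lambda \circ T = S(\lambda)$, which is straightforward since, for $x \in l^1$,

\[ \lambda(T(x)) = \sum_{n=1}^{\infty} x_{n+1} \lambda_n = (S(\lambda))(x). \]
∎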
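
The identity $\lambda(T(x)) = (S(\lambda))(x)$ can be checked by hand on finitely supported sequences. The Python sketch below does so for one pair of sequences; the function names and the sample data are hypothetical, and a finite list stands in for a sequence whose remaining terms do not affect the sums.

```python
# Hypothetical check that the right shift on l^infinity is the adjoint
# of the left shift on l^1, tested on one finitely supported sequence.

def pairing(x, lam):
    """<x, lam> = sum of x_n * lam_n; terms beyond the shorter list are ignored,
    which is harmless here because x is finitely supported."""
    return sum(a * b for a, b in zip(x, lam))

def left_shift(x):
    """T(x) = (x_2, x_3, ...)."""
    return x[1:]

def right_shift(lam):
    """S(lam) = (0, lam_1, lam_2, ...)."""
    return [0] + lam

x = [1, -2, 3, 4]      # a finitely supported element of l^1
lam = [5, 6, -7, 8]    # the first four terms of a bounded sequence

lhs = pairing(left_shift(x), lam)    # lam(T(x))
rhs = pairing(x, right_shift(lam))   # (S(lam))(x)
print(lhs, rhs)
```

Both sides reduce to the same sum $x_2\lambda_1 + x_3\lambda_2 + x_4\lambda_3$, which is the finite version of the series in the example.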
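Theorem 6.6.2. $\|T^*\| = \|T\|$.

Proof. By theorem 6.6.1,

\[ \|T\| = \sup\{|\langle Tx, \lambda \rangle| : \|x\| \le 1, \|\lambda\| \le 1\} = \sup\{|\langle x, T^*\lambda \rangle| : \|x\| \le 1, \|\lambda\| \le 1\} = \sup\{\|T^*\lambda\| : \|\lambda\| \le 1\} = \|T^*\|. \]
∎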
Example 2. $r(T) = r(T^*)$.
It is straightforward to show that, for $n \in \mathbb{N}$, $(T^n)^* = (T^*)^n$. It follows that
$\|(T^*)^n\| = \|(T^n)^*\| = \|T^n\|$. Employing theorem 6.5.7, we have

\[ r(T^*) = \lim_n \|(T^*)^n\|^{1/n} = \lim_n \|T^n\|^{1/n} = r(T). \]
∎
The next example utilizes several of the ideas of sections 6.3 and 6.4. 
Example 3. An operator $T \in \mathcal{L}(X)$ is invertible if and only if $T^*$ is invertible.

If $T$ is invertible, then $TT^{-1} = T^{-1}T = I_X$. Equating the adjoints of the operators
in the previous identities and using problem 2 at the end of this section, we have
$(T^{-1})^* T^* = T^* (T^{-1})^* = I_{X^*}$. Thus $T^*$ is invertible, and $(T^*)^{-1} = (T^{-1})^*$.

Now suppose $T^*$ is invertible, and write $S = (T^*)^{-1}$.

We show that $T$ is bounded away from zero. It then follows from example 8
in section 6.2 that $R(T)$ is closed and that $T$ is injective. Now

\[ \begin{aligned} \|x\| &= \sup\{|\langle x, \lambda \rangle| : \lambda \in X^*, \|\lambda\| \le 1\} \\ &= \sup\{|\langle x, T^*S\lambda \rangle| : \lambda \in X^*, \|\lambda\| \le 1\} \\ &= \sup\{|\langle Tx, S\lambda \rangle| : \lambda \in X^*, \|\lambda\| \le 1\} \\ &\le \|Tx\| \sup\{\|S\lambda\| : \lambda \in X^*, \|\lambda\| \le 1\} = \|Tx\| \|S\|. \end{aligned} \]

Thus $\|Tx\| \ge c\|x\|$, where $c = 1/\|S\|$.

If we show that $T$ is surjective, then $T$ is invertible by theorem 6.3.7, and
the proof will be complete. Suppose there is an element $y \in X - R(T)$. Since
$R(T)$ is closed, theorem 6.4.5 yields an element $\lambda \in X^*$ such that $\lambda(y) \ne 0$ and
$\lambda(Tx) = 0$ for every $x \in X$. Now $(T^*\lambda)(x) = \lambda(Tx) = 0$ for every $x \in X$; hence
$T^*\lambda = 0$. Since $T^*$ is injective, $\lambda = 0$, and this is a contradiction. ∎


Example 4. $\sigma(T) = \sigma(T^*)$.
For any $\lambda \in \mathbb{C}$, $(T - \lambda I_X)^* = T^* - \lambda I_{X^*}$. By the above example, $T - \lambda I_X$ is invertible
if and only if $T^* - \lambda I_{X^*}$ is invertible, and the result follows. Observe that the
result of example 2 is a trivial consequence of this result. ∎


Theorem 6.6.3. Let $T \in \mathcal{L}(X)$. Then, for every $x \in X$, $\widehat{Tx} = T^{**}(\hat{x})$.


Proof. For $\lambda \in X^*$, $\widehat{Tx}(\lambda) = \lambda(Tx) = (\lambda \circ T)(x) = (T^*\lambda)(x) = \hat{x}(T^*\lambda) = (\hat{x} \circ T^*)(\lambda) = (T^{**}(\hat{x}))(\lambda)$.
Thus $\widehat{Tx} = T^{**}(\hat{x})$. ∎


Loosely interpreted, the above theorem says that $T$ is the restriction of $T^{**}$ to $X$.

Definition. Let $M$ be a subspace of a Banach space $X$, and let $N$ be a subspace of
$X^*$. The annihilator $M^\perp$ of $M$ consists of all the functionals in $X^*$ that vanish
on $M$. Symbolically, $M^\perp = \{\lambda \in X^* : \lambda(M) = 0\}$. Similarly, the annihilator of $N$
is $N_\perp = \{x \in X : \lambda(x) = 0 \ \forall \lambda \in N\}$.


Example 5. Let $M$ be the set of all sequences in $l^1$ where every even term is 0. We
claim that $M^\perp$ is the set $S$ of all the sequences in $l^\infty$ where every odd term is 0. It
is clear that if $\lambda = (\lambda_n) \in S$, then, for every $x = (x_n) \in M$, $\sum_{n=1}^{\infty} x_n \lambda_n = 0$; hence
$S \subseteq M^\perp$. Conversely, if $\lambda = (\lambda_n) \in M^\perp$, then $\lambda(e_{2n-1}) = 0$ for every positive
integer $n$. This means that $\lambda_{2n-1} = 0$, and $\lambda \in S$. Observe that $M^\perp$ is a closed
subspace of $l^\infty$, consistent with the theorem below. ∎


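Theorem 6.6.4. $M^\perp$ is a closed subspace of $X^*$, and $N_\perp$ is a closed subspace of $X$.

Proof. For $x \in M$, $x^\perp = \{\lambda \in X^* : \lambda(x) = 0\}$ is a closed subspace of $X^*$ because
$x^\perp = \mathrm{Ker}(\hat{x})$. Consequently, $M^\perp = \bigcap_{x \in M} x^\perp$ is a closed subspace of $X^*$. Similarly,
$N_\perp = \bigcap_{\lambda \in N} \mathrm{Ker}(\lambda)$ is a closed subspace of $X$. ∎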


Theorem 6.6.5. Let $T \in \mathcal{L}(X)$, let $N(T)$ and $R(T)$ be the kernel and range of $T$,
respectively, and let $N(T^*)$ and $R(T^*)$ be the kernel and range of $T^*$. Then


(a) $N(T) = R(T^*)_\perp$.
(b) $N(T^*) = R(T)^\perp$.


Proof. (a) $x \in N(T)$ if and only if $Tx = 0$, if and only if $\langle Tx, \lambda \rangle = 0$ for all $\lambda \in X^*$, if
and only if $\langle x, T^*\lambda \rangle = 0$ for all $\lambda \in X^*$, if and only if $x \in R(T^*)_\perp$.


(b) $\lambda \in N(T^*)$ if and only if $T^*\lambda = 0$, if and only if $\langle x, T^*\lambda \rangle = 0$ for all $x \in X$, if
and only if $\langle Tx, \lambda \rangle = 0$ for every $x \in X$, if and only if $\lambda \in R(T)^\perp$. ∎


Quotient Spaces 


Let $M$ be a closed subspace of a normed linear space $X$. We define a norm on
the quotient space $X/M$ as follows: for $\tilde{x} = x + M \in X/M$, $\|\tilde{x}\| = \inf\{\|x - y\| : y \in M\} = \mathrm{dist}(x, M)$.
We leave it to the reader to verify that the norm we just defined
on $X/M$ is well defined and that $\|\tilde{x}\| = 0$ if and only if $\tilde{x} = 0$. The triangle inequality
is the only norm property yet to be verified. Let $x_1, x_2 \in X$, and $y_1, y_2 \in M$.
Since $y_1 + y_2 \in M$, $\mathrm{dist}(x_1 + x_2, M) \le \|(x_1 + x_2) - (y_1 + y_2)\| \le \|x_1 - y_1\| + \|x_2 - y_2\|$.
Because the last inequality is valid for arbitrary elements $y_1$ and $y_2$ of $M$,
$\mathrm{dist}(x_1 + x_2, M) \le \mathrm{dist}(x_1, M) + \mathrm{dist}(x_2, M)$, that is, $\|\tilde{x}_1 + \tilde{x}_2\| \le \|\tilde{x}_1\| + \|\tilde{x}_2\|$.

Remarks. 1. If $\|\tilde{x}\| < \delta$, then there exists $\xi \in \tilde{x}$ such that $\|\xi\| < \delta$. This is because
if $\|\tilde{x}\| < \delta$, then there exists $y \in M$ such that $\|x - y\| < \delta$. Set $\xi = x - y$.
2. For $x \in X$, $\|\tilde{x}\| \le \|x\|$. This is because $0 \in M$; hence $\|x\| = \|x - 0\| \ge \|\tilde{x}\|$.
3. It follows directly from remark 2 that if $(x_n)$ converges to $x$ in $X$, then $(\tilde{x}_n)$
converges to $\tilde{x}$ in $X/M$. In particular, if the series $\sum_{n=1}^{\infty} \xi_n$ converges in $X$,
then $\sum_{n=1}^{\infty} \tilde{\xi}_n$ converges in $X/M$.


Theorem 6.6.6. Let M be a closed subspace of a Banach space X. Then X/M is a 
Banach space. 


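Proof. We use the result of problem 10 in section 6.1. Suppose $\sum_{n=1}^{\infty} \|\tilde{x}_n\| < \infty$. We
prove that $\sum_{n=1}^{\infty} \tilde{x}_n$ converges in $X/M$. By remark 1, for every $n \in \mathbb{N}$, there exists an
element $\xi_n \in \tilde{x}_n$ such that $\|\xi_n\| < \|\tilde{x}_n\| + 1/2^n$. Now $\sum_{n=1}^{\infty} \|\xi_n\| \le \sum_{n=1}^{\infty} (\|\tilde{x}_n\| + 1/2^n) = 1 + \sum_{n=1}^{\infty} \|\tilde{x}_n\| < \infty$.
By the completeness of $X$, the series $\sum_{n=1}^{\infty} \xi_n$ converges
in $X$, and, by remark 3, $\sum_{n=1}^{\infty} \tilde{\xi}_n = \sum_{n=1}^{\infty} \tilde{x}_n$ converges in $X/M$. ∎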


Theorem 6.6.7. Let M be a closed subspace of a Banach space X. Then 


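(a) $(X/M)^*$ is isometrically isomorphic to $M^\perp$.
(b) $X^*/M^\perp$ is isometrically isomorphic to $M^*$.


Proof. (a) Define a map $\delta : M^\perp \to (X/M)^*$ by $\delta : \lambda \mapsto \delta_\lambda$, where $\delta_\lambda(x + M) = \delta_\lambda(\tilde{x}) = \lambda(x)$;
$\delta$ is onto since if $\mu \in (X/M)^*$, define a functional $\lambda : X \to \mathbb{C}$ by
$\lambda(x) = \mu(\tilde{x})$. It is easy to see that $\lambda \in X^*$, $\lambda \in M^\perp$, and $\delta_\lambda = \mu$. To show that $\delta$
is an isometry, first notice that $\|\tilde{x}\| \le \|x\|$ and, by remark 1, if $\|\tilde{x}\| < 1$, there
exists an element $x \in \tilde{x}$ such that $\|x\| < 1$. Therefore $\|\delta_\lambda\| = \sup_{\|\tilde{x}\| < 1} |\delta_\lambda(\tilde{x})| = \sup_{\|x\| < 1} |\lambda(x)| = \|\lambda\|$.


(b) Let $\mu \in M^*$. By the Hahn-Banach theorem, $\mu$ has an extension $\lambda \in X^*$.
Define a mapping $\sigma : M^* \to X^*/M^\perp$ by $\sigma_\mu = \lambda + M^\perp$; $\sigma$ is well defined
because if $\lambda$ and $\lambda'$ are bounded extensions of $\mu$, then $(\lambda - \lambda')(M) = 0$; hence
$\lambda - \lambda' \in M^\perp$, and $\lambda + M^\perp = \lambda' + M^\perp$. The linearity of $\sigma$ is obvious and since
the restriction of any $\lambda \in X^*$ to $M$ is in $M^*$, $\sigma$ is onto. It remains to show
that $\|\sigma_\mu\| = \|\mu\|$. Observe that $\sigma_\mu$ is the collection of all bounded extensions
of $\mu$. Thus, by the definition of the quotient norm, $\|\sigma_\mu\| = \inf\{\|\lambda\|\}$, where $\lambda$
is a bounded extension of $\mu$. Since, for any such $\lambda$, $\|\mu\| \le \|\lambda\|$, it follows that
$\|\mu\| \le \|\sigma_\mu\| \le \|\lambda\|$. The Hahn-Banach theorem also guarantees an extension $\lambda$
for which $\|\lambda\| = \|\mu\|$. Thus $\|\sigma_\mu\| = \|\mu\|$. ∎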
Exercises


1. Show that if $T, S \in \mathcal{L}(X)$ and $a$ and $b$ are scalars, then $(aT + bS)^* = aT^* + bS^*$.
Conclude that if $X$ is reflexive, then the correspondence $V : T \mapsto T^*$ is
an isometric isomorphism from $\mathcal{L}(X)$ to $\mathcal{L}(X^*)$.

2. If $T$ and $S$ are as in the above exercise, show that $(ST)^* = T^*S^*$.

3. Let $X$ be a Banach space. Prove that $X^\perp = \{0\}$, $\{0\}^\perp = X^*$. State and prove the
corresponding statements for $X^*$.

4. Let $M$ be a subspace of a Banach space $X$. Prove that $(M^\perp)_\perp = \overline{M}$. Hint: Use
theorem 6.4.5 to show that if $x \notin \overline{M}$, then $x \notin (M^\perp)_\perp$.

5. Let $X$ be a Banach space, and let $T \in \mathcal{L}(X)$. Prove that $\overline{R(T)} = N(T^*)_\perp$.
Conclude that $R(T)$ is dense if and only if $T^*$ is one-to-one.

6. Let $X$ be a Banach space, and let $T \in \mathcal{L}(X)$. Show that if $x_n \xrightarrow{w} x$, then
$Tx_n \xrightarrow{w} Tx$.

7. Let $S$ and $T$ be commuting bounded linear operators on $X$. Prove that the
eigenspaces of $T$ are $S$-invariant.

8. Let $T \in \mathcal{L}(X)$, and suppose $M$ is a $T$-invariant subspace of $X$. Prove that $M^\perp$
is invariant under $T^*$.

9. Verify the details of the proof that the norm defined on $X/M$ is indeed a
norm.

10. Show that if $M$ is a closed subspace of a Banach space $X$, then the quotient
map $\pi : X \to X/M$ is continuous. Also prove that if $N$ is a finite-dimensional
subspace of $X$, then $\pi(N)$ is a finite-dimensional subspace of
$X/M$.

11. In the quotient space $l^\infty/c_0$, prove that $\|\tilde{x}\| = \limsup_n |x_n|$. Hint: For $\epsilon > 0$,
there are finitely many $n \in \mathbb{N}$ such that $|x_n| > \limsup_n |x_n| + \epsilon$.

12. Let $X$ be a Banach space, and let $T \in \mathcal{L}(X)$. Define $\tilde{T} : X/\mathrm{Ker}(T) \to R(T)$
by $\tilde{T}(\tilde{x}) = T(x)$. Prove that $\tilde{T}$ is a bounded isomorphism. Hint: To show
the continuity of $\tilde{T}$, suppose $\tilde{x}_n \to 0$, and choose $x_n \in \tilde{x}_n$ such that $\|x_n\| < \|\tilde{x}_n\| + 1/n$.

13. Let $R$ be the right shift operator on $l^2$, let $M_1$ be the range of $R$, and let $M_2$ be
the range of $R^2$. Determine the quotient spaces $l^2/M_1$ and $l^2/M_2$. Conclude
that if $M_1$ and $M_2$ are isomorphic closed subspaces of a Banach space $X$,
then it is not necessarily true that $X/M_1$ and $X/M_2$ are isomorphic.

14. Prove that if $M$ is a closed subspace of a separable Banach space $X$, then
$X/M$ is separable.

15. Let $M$ be a closed subspace of a Banach space $X$. Prove that if $X^*$ is separable,
then so is $M^*$.

16. Let $M$ be a closed subspace of a Banach space $X$. Prove that if $M$ and $X/M$
are separable, then $X$ is separable. Hint: Let $\{x_n\} \subseteq X$ be such that $\{\tilde{x}_n\}$ is
dense in $X/M$, and let $\{y_n\} \subseteq M$ be dense in $M$.

17. Let $X$ be a normed linear space, and let $M$ be a closed subspace of $X$. Prove
that if $M$ and $X/M$ are Banach spaces, then so is $X$.

18. Let $M$ be a closed subspace of a Banach space $X$, and let $N$ be a finite-dimensional
subspace of $X$. Show that $M + N$ is closed. Hint: Consider
$\pi^{-1}(\pi(N))$, where $\pi : X \to X/M$ is the quotient map.

19. Let $M$ be a complemented subspace of a Banach space $X$. Show that $M^\perp$ is
complemented in $X^*$. Hint: Let $P$ be the projection of $X$ onto $M$, and let
$N = \mathrm{Ker}(P)$. Show that $M^\perp = \mathrm{Ker}(P^*)$ and that $N^\perp = R(P^*)$.

20. Define a linear operator T € L(cg) as follows: for x = (x,), T(x) = (). 
Describe T*. Recall the result of problem 16 on section 6.2. " 


6.7 Weak Topologies 


The weak topologies are defined in much the same way the product topology 
is defined. They are designed to guarantee the continuity of a certain class of 
functions. We urge the reader to look up theorem 5.4.1, the definition of the 
product topology in section 5.12, and theorem 5.12.1. This section is terminal and 
may be omitted without loss of continuity. 


Definition. Let X be a normed linear space. The weak topology on X is the 
smallest topology relative to which all the bounded linear functionals on X are 
continuous. We use the abbreviation w-topology for the weak topology on X. 


Definition. Let $X$ be a normed linear space, and let $X^*$ be its dual. The weak*
topology on $X^*$ is the smallest topology on $X^*$ relative to which the functionals
$\hat{x}$ are continuous. Here $\hat{x}$ is the image of $x \in X$ under the natural embedding of
$X$ into $X^{**}$. We use the abbreviation w*-topology for the weak* topology on $X^*$.


Notice that the definitions of the w- and w*-topologies are asymmetric. Only the
functionals on $X^*$ of the form $\hat{x}$ are admitted in the definition of the w*-topology on
$X^*$. Thus if $X$ is not reflexive, then the functionals in $X^{**} - \hat{X}$ are not guaranteed
to be continuous in the w*-topology, and indeed they are not. See theorem 6.7.6.


In order to eliminate any potential confusion, we specifically refer to the topology 
generated by the norm on a space X (or its dual X*) as the norm topology on X 
(or X*). The norm topology is also referred to as the strong topology. We denote 
the closed unit balls of a normed linear space X and its dual X* by B and B*, 
respectively. We use notation such as (B*,w*) to indicate the closed unit ball of 
X*, when it is endowed with the w*-topology. 
It follows directly from the definitions that an open base for the w-topology is
the collection of sets of the form $\bigcap_{i=1}^{n} \{x \in X : |\lambda_i(x) - \lambda_i(x_0)| < r\}$, where $r > 0$,
$x_0 \in X$, and $\{\lambda_1, \dots, \lambda_n\}$ is a finite subset of $X^*$. Similarly, an open base for the w*-topology
is the collection of all sets of the type $\bigcap_{i=1}^{n} \{\lambda \in X^* : |\lambda(x_i) - \lambda_0(x_i)| < r\}$,
where $r > 0$, $\lambda_0 \in X^*$, and $\{x_1, \dots, x_n\}$ is a finite subset of $X$.


In the exercises in section 6.4, we introduced the notion of a weakly convergent 
sequence in a normed linear space. We now reconcile this concept with the defi- 
nition of the weak topologies. First recall the definition of a convergent sequence 
in a topological space. 


Definition. A sequence (x,,) in a topological space X is said to converge to a point 
x € X if every open neighborhood of x contains all but finitely many terms of 
the sequence (x,,). Thus if U is an open neighborhood of x, then there exists a 
natural number N such that x, € U for every n> N. 


Theorem 6.7.1.
(a) A sequence $(x_n)$ converges to $x$ in the w-topology on a normed linear space $X$
if and only if $\lim_n \lambda(x_n) = \lambda(x)$ for every $\lambda \in X^*$.
(b) A sequence $(\lambda_n)$ converges to $\lambda$ in the w*-topology if and only if $\lim_n \lambda_n(x) = \lambda(x)$
for every $x \in X$.


Proof. We prove part (b). Let $\lambda_n$ and $\lambda_0$ be such that $\lim_n \lambda_n(x) = \lambda_0(x)$ for every
$x \in X$. We show that $\lambda_n$ converges to $\lambda_0$ in the w*-topology. If $U$ is a w*-open
neighborhood of $\lambda_0$, then there exists $r > 0$ and a finite subset $\{x_1, \dots, x_m\}$
of $X$ such that $\bigcap_{i=1}^{m} \{\lambda \in X^* : |\lambda(x_i) - \lambda_0(x_i)| < r\} \subseteq U$. Since for all $1 \le i \le m$,
$\lim_n \lambda_n(x_i) = \lambda_0(x_i)$, there is a natural number $N$ such that $|\lambda_n(x_i) - \lambda_0(x_i)| < r$
for all $n \ge N$ and all $1 \le i \le m$. This means that $\lambda_n \in \bigcap_{i=1}^{m} \{\lambda \in X^* : |\lambda(x_i) - \lambda_0(x_i)| < r\} \subseteq U$
for every $n \ge N$. The proof of the converse is a partial reversal
of the above argument. ∎
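
To see how weak convergence can differ from norm convergence, consider the standard unit vectors $e_n$ in $l^2$. The Python sketch below is an illustration under stated assumptions, not a proof: it truncates sequences to `N` coordinates, it tests a single functional, and it takes for granted that pairing with a square-summable sequence `y` defines a bounded functional on $l^2$. All names in the code are hypothetical.

```python
import math

N = 1000  # truncation length; a finite stand-in for sequences in l^2

def unit_vector(n):
    """The standard unit vector e_n (1-indexed), truncated to N coordinates."""
    return [1.0 if k == n else 0.0 for k in range(1, N + 1)]

def inner(x, y):
    """Real inner product of two truncated sequences."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

# A square-summable test sequence y = (1, 1/2, 1/3, ...).
y = [1.0 / k for k in range(1, N + 1)]

indices = (1, 10, 100, 1000)
pairings = [inner(unit_vector(n), y) for n in indices]   # equals y_n = 1/n
norms = [norm(unit_vector(n)) for n in indices]          # always 1
print(pairings)
print(norms)
```

The pairings shrink like $1/n$ while the norms stay equal to 1, which is the behavior one expects from a sequence that tends to 0 weakly but not in norm.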


Theorem 6.7.2. Let X be a finite-dimensional normed linear space. Then 


(a) The w-topology and the norm topology on X coincide. 
(b) The w*-topology and the norm topology on X* coincide. 


Proof. We prove part (b). We show that if $U = \{\lambda \in X^* : \|\lambda - \lambda_0\| < r\}$, then $U$ contains
a w*-neighborhood $V$ of $\lambda_0$. Let $\{e_1, \dots, e_n\}$ be a basis for $X$, and define a norm
on $X^*$ by $\|\lambda\|' = \max_{1 \le i \le n} |\lambda(e_i)|$. Since all norms on $X^*$ are equivalent, there
exists $\delta > 0$ such that $\|\lambda\|' < \delta$ implies that $\|\lambda\| < r$. The w*-open neighborhood
$V = \bigcap_{i=1}^{n} \{\lambda \in X^* : |\lambda(e_i) - \lambda_0(e_i)| < \delta\}$ of $\lambda_0$ is contained in $U$. ∎

Before we can prove that the w-topology is different from the norm topology on
an infinite-dimensional normed linear space, we need a fact from general vector
space theory.


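Lemma 6.7.3. Let $U$, $V$, and $W$ be vector spaces, and let $\pi : U \to V$ and $\varphi : U \to W$
be linear mappings such that $\mathrm{Ker}(\pi) \subseteq \mathrm{Ker}(\varphi)$. Then there exists a linear mapping
$\psi : V \to W$ such that $\psi \circ \pi = \varphi$.


Proof. Let $V_1 = R(\pi)$, and define $\psi : V_1 \to W$ by $\psi(\pi(x)) = \varphi(x)$. The condition
$\mathrm{Ker}(\pi) \subseteq \mathrm{Ker}(\varphi)$ guarantees that $\psi$ is well defined. Let $V_2$ be an algebraic
complement of $V_1$ in $V$, and extend the definition of $\psi$ to $V$ by $\psi(v) = \psi(v_1)$, where
$v = v_1 + v_2$, $v_1 \in V_1$, and $v_2 \in V_2$. By construction, $\psi \circ \pi = \varphi$. ∎


Lemma 6.7.4. Let $X$ be a vector space, and let $\lambda$ and $\lambda_1, \dots, \lambda_n$ be linear functionals
on $X$. If $\bigcap_{i=1}^{n} \mathrm{Ker}(\lambda_i) \subseteq \mathrm{Ker}(\lambda)$, then $\lambda$ is a linear combination of $\lambda_1, \dots, \lambda_n$.


Proof. Define $\pi : X \to \mathbb{K}^n$ by $\pi(x) = (\lambda_1(x), \dots, \lambda_n(x))$. The condition $\bigcap_{i=1}^{n} \mathrm{Ker}(\lambda_i) \subseteq \mathrm{Ker}(\lambda)$
implies that $\mathrm{Ker}(\pi) \subseteq \mathrm{Ker}(\lambda)$. The previous lemma produces a functional
$\psi : \mathbb{K}^n \to \mathbb{K}$ such that $\psi \circ \pi = \lambda$. Because $\psi$ is linear, there exist scalars $a_1, \dots, a_n$
such that for $(v_1, \dots, v_n) \in \mathbb{K}^n$, $\psi(v_1, \dots, v_n) = \sum_{i=1}^{n} a_i v_i$. Now, for $x \in X$,
$\lambda(x) = (\psi \circ \pi)(x) = \psi(\lambda_1(x), \dots, \lambda_n(x)) = \sum_{i=1}^{n} a_i \lambda_i(x)$. ∎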


Theorem 6.7.5. A weakly open subset U of an infinite-dimensional normed linear 
space X is unbounded. 


Proof. Without loss of generality, we assume that $0 \in U$. Then there is $r > 0$ and
a finite subset $\{\lambda_1, \dots, \lambda_n\}$ of $X^*$ such that $\bigcap_{i=1}^{n} \{x \in X : |\lambda_i(x)| < r\} \subseteq U$. The set
$N = \bigcap_{i=1}^{n} \mathrm{Ker}(\lambda_i)$ is clearly contained in $U$. If $N = \{0\}$, then, for every $\lambda \in X^*$,
$N \subseteq \mathrm{Ker}(\lambda)$. By lemma 6.7.4, every $\lambda \in X^*$ would be a linear combination of
$\lambda_1, \dots, \lambda_n$, contradicting the assumption that $X$, hence $X^*$, is infinite dimensional.
Thus $N \ne \{0\}$, and, for any nonzero $x \in N$, the line $\{cx : c \in \mathbb{R}\} \subseteq N$; hence $U$ is
unbounded. ∎


The above theorem implies that the weak and norm topologies on an infinite- 
dimensional space X are distinct since no open bounded subset of X can be 
weakly open. 


Weak topologies are generally intricate, and good caution must be exercised when 
formulating arguments involving them. In metric topologies, when one speaks 
of an open neighborhood of a point x, one instinctively thinks of an open ball 
centered at x. A w-open neighborhood of a point looks nothing like an open ball 
since bounded subsets of X are never w-open. 
We now prove that the w*-topology is very tight in the sense that it admits
the continuity of no linear functionals other than the functionals $\hat{x}$ used in the
definition of the w*-topology.


Theorem 6.7.6. Let $X$ be a Banach space, and let $F$ be a w*-continuous linear
functional on $X^*$. Then $F = \hat{x}$ for some $x \in X$.


Proof. Let $D$ be the open unit disk in the complex plane. By the w*-continuity of
$F$, $F^{-1}(D)$ is w*-open; hence it contains a w*-neighborhood of 0 of the form
$U = \bigcap_{i=1}^{n} \{\lambda \in X^* : |\lambda(x_i)| < r\}$ for some $r > 0$, and some finite subset $\{x_1, \dots, x_n\}$
of $X$. In particular, $F(U)$ is a bounded subset of the complex plane. We show that
$\bigcap_{i=1}^{n} \mathrm{Ker}(\hat{x}_i) \subseteq \mathrm{Ker}(F)$. If $\lambda \in \bigcap_{i=1}^{n} \mathrm{Ker}(\hat{x}_i)$, then, clearly, $\lambda \in U$ and $c\lambda \in U$ for
all $c \in \mathbb{R}$; hence $|c||F(\lambda)| = |F(c\lambda)|$ is bounded for all $c \in \mathbb{R}$. This forces $F(\lambda) = 0$.
Therefore $\bigcap_{i=1}^{n} \mathrm{Ker}(\hat{x}_i) \subseteq \mathrm{Ker}(F)$. By lemma 6.7.4, $F$ is a linear combination of
$\hat{x}_1, \dots, \hat{x}_n$; hence $F \in \hat{X}$. ∎


Theorem 6.7.7 (the Banach-Alaoglu theorem). Let $X$ be a normed linear space.
Then $B^* = \{\lambda \in X^* : \|\lambda\| \le 1\}$ is compact in the w*-topology.


Proof. For each $x \in X$, let $D_x = \{z \in \mathbb{C} : |z| \le \|x\|\}$, and let $D = \prod_{x \in X} D_x$. By
Tychonoff’s theorem, $D$ is compact. For each $\lambda \in B^*$, define $f_\lambda \in D$ by $f_\lambda(x) = \lambda(x)$.
Since, for $x \in X$, $|f_\lambda(x)| \le \|\lambda\| \|x\| \le \|x\|$, $f_\lambda(x) \in D_x$, and, indeed, $f_\lambda \in D$. The
function $f : B^* \to D$ given by $\lambda \mapsto f_\lambda$ clearly injects $B^*$ into $D$. For the rest of the
proof, we identify $\lambda$ and $f_\lambda$ and consider $B^*$ as a subset of $D$. The w*-topology
on $B^*$ is the restriction of the product topology on $D$ to $B^*$. The proof will
be complete if we show that $B^*$ is closed in $D$. Let $\mu \in D$ be a closure point
of $B^*$. We need to show that $\mu \in B^*$. Fix a pair of points $x, y \in X$, and let
$\epsilon > 0$. The $D$-open set $\{g \in D : |g(x) - \mu(x)| < \epsilon/3, |g(y) - \mu(y)| < \epsilon/3, |g(x + y) - \mu(x + y)| < \epsilon/3\}$
intersects $B^*$, so there exists an element $\lambda \in B^*$ such that
$|\lambda(x) - \mu(x)| < \epsilon/3$, $|\lambda(y) - \mu(y)| < \epsilon/3$, and $|\lambda(x + y) - \mu(x + y)| < \epsilon/3$. Since
$\lambda(x + y) - \lambda(x) - \lambda(y) = 0$,

\[ \begin{aligned} |\mu(x + y) - \mu(x) - \mu(y)| &= |\mu(x + y) - \mu(x) - \mu(y) - \lambda(x + y) + \lambda(x) + \lambda(y)| \\ &\le |\lambda(x + y) - \mu(x + y)| + |\lambda(x) - \mu(x)| + |\lambda(y) - \mu(y)| < \epsilon. \end{aligned} \]

Since $\epsilon$ is arbitrary, $\mu(x + y) = \mu(x) + \mu(y)$. The homogeneity of $\mu$ is proved using
a similar argument. Finally, $\mu \in B^*$ because $\mu(x) \in D_x$, so $|\mu(x)| \le \|x\|$, which
means that $\mu$ is bounded and $\|\mu\| \le 1$. ∎


The following theorem is curious if not very practical. 
Corollary 6.7.8. Every Banach space $X$ is isometrically isomorphic to a closed
subspace of $C(K)$ for some compact Hausdorff space $K$.


Proof. By the Banach-Alaoglu theorem, the space $K = (B^*, w^*)$ is compact. Define
a function $F : X \to C(K)$ by $F(x) = \hat{x}$; $F$ is a linear isometry since $\|F(x)\|_\infty = \sup\{|\hat{x}(\lambda)| : \lambda \in B^*\} = \|\hat{x}\| = \|x\|$.
(Note: The norm on $C(K)$ is the supremum
norm.) Thus $X$ is isometrically isomorphic to $F(X)$, which is a closed subspace
of $C(K)$. ∎


Theorem 6.7.9. Let X be a separable Banach space. Then (B* ,w*) is metrizable. 


Proof. Let $\{x_n\}$ be a dense subset of $B$. For $\lambda, \mu \in B^*$, define

\[ d(\lambda, \mu) = \sum_{n=1}^{\infty} 2^{-n} |\lambda(x_n) - \mu(x_n)|. \]

If $d(\lambda, \mu) = 0$, then $(\lambda - \mu)(x_n) = 0$ for all $n$. The density of $\{x_n\}$ forces $\lambda = \mu$.
The other defining properties of a metric are easy to verify. We show that the
identity function $I : (B^*, w^*) \to (B^*, d)$ is a homeomorphism. It follows that
$(B^*, w^*)$ is metrizable. Since $(B^*, w^*)$ is compact and $(B^*, d)$ is Hausdorff, it
suffices to show that $I$ is continuous. See theorem 5.8.7. To this end, we prove that a
$d$-open ball $U = \{\lambda \in B^* : d(\lambda, \lambda_0) < r\}$ contains a w*-open neighborhood $V$
of $\lambda_0$. Because $|\lambda(x_i) - \lambda_0(x_i)| \le 2$ for all $\lambda \in B^*$ and all $i \in \mathbb{N}$, there exists
a positive integer $N$ such that $\sum_{i=N+1}^{\infty} 2^{-i} |\lambda(x_i) - \lambda_0(x_i)| < r/2$ for
all $\lambda \in B^*$. For every $\lambda \in B^*$ satisfying $|\lambda(x_i) - \lambda_0(x_i)| < r/2$ for all $1 \le i \le N$,
we have $d(\lambda, \lambda_0) = \sum_{i=1}^{N} 2^{-i} |\lambda(x_i) - \lambda_0(x_i)| + \sum_{i=N+1}^{\infty} 2^{-i} |\lambda(x_i) - \lambda_0(x_i)| < r/2 + r/2$.
Thus the w*-neighborhood $V = \bigcap_{i=1}^{N} \{\lambda \in B^* : |\lambda(x_i) - \lambda_0(x_i)| < r/2\}$ of $\lambda_0$ is
contained in $U$. ∎


Corollary 6.7.10. If X is separable, so is (X*,w*). 


Proof. By theorems 6.7.7 and 6.7.9, $(B^*, w^*)$ is compact and metrizable; hence, it is
separable. Since $X^* = \bigcup_{n=1}^{\infty} nB^*$, $X^*$ is separable in the w*-topology. ∎


The converse of theorem 6.7.9 is also true. Recall (see theorem 4.9.10) that if K is 
a compact metric space, then C(K) is separable. 


Theorem 6.7.11. Let X be a Banach space. Then (B*,w*) is metrizable if and only 
if X is separable. 


Proof. Suppose $(B^*, w^*)$ is metrizable. Theorem 6.7.7 implies that $K = (B^*, w^*)$ is a
compact metrizable space; hence $C(K)$ is separable. The mapping $F : X \to C(K)$
defined by $F(x) = \hat{x}$ is an isometry. Hence $X$ is separable because it is isometric to
$F(X)$, which is separable, being a subspace of a separable metric space. The converse is theorem 6.7.9. ∎


Exercises


1. Prove that the w- and the w*-topologies are Hausdorff.

2. Complete the proof of theorem 6.7.1.

3. Complete the proof of theorem 6.7.2.

4. Let $X$ be an infinite-dimensional normed linear space. Prove that every w-neighborhood
of 0 contains an infinite-dimensional subspace of $X$. Hint:
Examine the proof of theorem 6.7.5, and see problem 19 in section 3.4.

5. In connection with corollary 6.7.8, prove that $F(X)$ is a closed subspace
of $C(K)$.

6. Let $M$ be a subspace of a normed linear space $X$. Prove that the w-closure
of $M$ is a subspace of $X$.

7. Prove that every norm-closed subspace $M$ of a Banach space $X$ is w-closed.
Conclude that the w-closure and the norm-closure of a subspace of $X$
coincide.

8. Prove that a Banach space is separable if it is weakly separable.

9. Prove that if $X$ is a Banach space such that $X^*$ is separable, then $(B, w)$ is
metrizable.

10. Let $K$ be a compact subset of a Banach space $X$. Prove that the weak and
norm topologies on $K$ coincide. Hint: See problem 6 in section 5.8.


7. 
Hilbert Spaces 


Wir müssen wissen.
Wir werden wissen.
David Hilbert. 1862-1943 


David Hilbert. 1862-1943 


Upon graduation from the Wilhelm Gymnasium, where he spent his final year of 
schooling, Hilbert enrolled at the University of Königsberg in the autumn of 1880.
He received his Ph.D. from Königsberg in 1885, remained there as a member of
staff from 1886 to 1895, and was promoted to the rank of professor in 1893. In 1895
Hilbert was appointed to the chair of mathematics at the University of Göttingen,
where he spent the rest of his career. Among Hilbert’s numerous students were 
Hermann Weyl, Felix Bernstein, Otto Blumenthal, Richard Courant, Alfred Haar, 
and Hugo Steinhaus. 


Hilbert contributed to many branches of mathematics, including geometry, 
algebraic number fields, functional analysis, integral equations, mathematical 
physics, and the calculus of variations. Hilbert’s work in geometry had the greatest 
influence in that area after Euclid. A systematic study of the axioms of Euclidean 
geometry led Hilbert to propose twenty-one such axioms, and he analyzed their 
significance. He published Grundlagen der Geometrie in 1899, putting geometry
in a formal axiomatic setting. Hilbert is most remembered for studying infinite- 
dimensional Euclidean spaces, which are now known as Hilbert spaces. 


Hilbert’s famous twenty-three Paris problems challenged (and still today 
challenge) mathematicians to solve fundamental questions. Hilbert’s famous 
speech The Problems of Mathematics was delivered to the Second International 
Congress of Mathematicians in Paris. It was a speech full of optimism for 
mathematics in the coming century, and he felt that open problems were the sign 
of vitality in the subject. Hilbert’s problems included the continuum hypothesis, 
Goldbach’s conjecture, and the Riemann hypothesis. 


Hilbert’s mathematical abilities were nicely summed up by Otto Blumenthal, his 
first student: 


In the analysis of mathematical talent one has to differentiate between the ability to 
create new concepts that generate new types of thought structures and the gift for 
sensing deeper connections and underlying unity. In Hilbert’s case, his greatness 
lies in an immensely powerful insight that penetrates into the depths of a question. 
All of his works contain examples from far-flung fields in which only he was able 
to discern an interrelatedness and connection with the problem at hand. From 
these, the synthesis, his work of art, was ultimately created. Insofar as the creation 
of new ideas is concerned, I would place Minkowski higher, and of the classical 
great ones, Gauss, Galois, and Riemann. But when it comes to penetrating insight, 
only a few of the very greatest were the equal of Hilbert. 


Hilbert retired in 1930, and the city of Königsberg made him an honorary citizen.
He gave an address which ended with famous words that now appear on his
epitaph:


Wir müssen wissen, wir werden wissen: We must know, we shall know.*


* Perhaps as a rebuttal of Du Bois-Reymond’s statement “we do not know and will not know,”
reflecting the idea that scientific knowledge is unknown and unknowable.


7.1 Definitions and Basic Properties


Let $\{u_1, u_2, \dots\}$ be an infinite orthonormal sequence of vectors in an inner product
space $H$, and let $x \in H$. In the introduction to section 4.10, we posed the
following problem. Under what conditions does the sequence of orthogonal
projections, $S_n x = \sum_{i=1}^{n} \langle x, u_i \rangle u_i = \sum_{i=1}^{n} \hat{x}_i u_i$, of $x$ on the finite-dimensional space
$M_n = \mathrm{Span}(\{u_1, \dots, u_n\})$, converge to $x$? Regardless of whether $S_n x$ converges to
$x$, it is a Cauchy sequence. To see this, recall the result of problem 5 in section
3.7 (also see theorem 7.2.6), which states that $\sum_{i=1}^{\infty} |\hat{x}_i|^2 < \infty$. Now, for $m > n$,

\[ \|S_m x - S_n x\|^2 = \sum_{i=n+1}^{m} |\hat{x}_i|^2. \]

The sum in the last expression tends to 0 as $n \to \infty$
because it is the middle section of the convergent series $\sum_{i=1}^{\infty} |\hat{x}_i|^2$. Thus we have a
sufficient condition for the convergence of the sequence $S_n x$: the completeness of
$H$. This is exactly the definition of a Hilbert space. The completeness of $H$ merely
guarantees the convergence of $S_n x$. It does not guarantee that $\lim_n S_n x = x$, as the
following situation illustrates. If $u \in H$ is a unit vector orthogonal to each $u_n$, then
$S_n u = 0$ for all $n \in \mathbb{N}$; hence $\lim_n S_n u = 0 \ne u$. To remedy this situation, one may
want to impose the condition that no such vector $u$ exists. Equivalently, this means
that the sequence $\{u_1, u_2, \dots\}$ is a maximal orthonormal subset of $H$, and this is
precisely the definition of a countable orthonormal basis for $H$. Hilbert spaces
and orthonormal bases are the subject of our study in this section and the next.
The question about the smallest Hilbert space $H$ in which trigonometric series
of functions in $H$ converge will be settled in section 8.9, together with related
questions pertaining to orthogonal polynomials. It is strongly recommended that you
study sections 3.7 and 4.10 before you tackle this chapter.


Definition. A Hilbert space is a complete inner product space.

Example 1. The spaces $\mathbb{K}^n$ and $\ell^2$ are Hilbert spaces. ♦


Example 2. The space $(\mathbb{K}(\mathbb{N}), \|.\|_2)$ is not a Hilbert space. We use the fact that a
subspace of $\ell^2$ is complete if and only if it is closed. Now $\mathbb{K}(\mathbb{N})$ is not closed
in $\ell^2$ because it contains the sequence $x_1 = (1, 0, 0, \dots)$, $x_2 = (1, 1/2, 0, 0, \dots)$, ...,
$x_n = (1, 1/2, 1/3, \dots, 1/n, 0, 0, \dots)$. The limit of the sequence $(x_n)$ is the harmonic
sequence $x = (1, 1/2, \dots, 1/n, \dots)$ because $\|x_n - x\|_2^2 = \sum_{j=n+1}^{\infty} 1/j^2 \to 0$ as
$n \to \infty$. Clearly, $x \notin \mathbb{K}(\mathbb{N})$. ♦
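The convergence claimed in example 2 can be observed numerically. The short Python sketch below is only an illustration: the function name `tail_norm_sq` and the truncation level `terms` are assumptions introduced here so that the infinite tail $\sum_{j>n} 1/j^2$ can be summed on a computer.

```python
def tail_norm_sq(n, terms=100_000):
    """Approximate ||x_n - x||_2^2 = sum_{j > n} 1/j^2.

    The infinite tail is truncated at `terms` (an assumption made for
    computability); the discarded part is smaller than 1/terms.
    """
    return sum(1.0 / (j * j) for j in range(n + 1, terms + 1))


ns = (1, 10, 100, 1000)
tails = [tail_norm_sq(n) for n in ns]
for n, t in zip(ns, tails):
    # Each tail lies below the integral bound 1/n, so it tends to 0.
    print(f"n = {n:5d}   ||x_n - x||^2 ~ {t:.6f}   bound 1/n = {1.0 / n:.6f}")
```

Each $x_n$ has finitely many nonzero terms, while the limit $x$ does not, which is exactly why $\mathbb{K}(\mathbb{N})$ fails to be closed.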

For ease of reference, we state, without proof, a few results from section 3.7. We
urge the reader to look up the proofs and the basic definitions in section 3.7.


Theorem 7.1.1 (the Cauchy-Schwarz inequality). If $H$ is an inner product space,
then, for all $x, y \in H$, $|\langle x, y \rangle| \le \|x\| \|y\|$. Equality holds if and only if $x$ and $y$ are
linearly dependent.


Corollary 7.1.2. In an inner product space $H$, $\|x + y\| \le \|x\| + \|y\|$. Here
$\|x\| = \langle x, x \rangle^{1/2}$ is the norm on $H$ induced by the inner product.


Theorem 7.1.3. Let $x$ and $y$ be elements of an inner product space $H$.

(a) The Pythagorean theorem. If $x$ and $y$ are orthogonal, then
\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2 . \]
(b) The Parallelogram law.
\[ \|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2 . \]
(c) The Polarization identity.
\[ \|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2 = 4\langle x, y \rangle . \]

Proof. We leave the proof as an exercise. ■


Not all norms are induced by an inner product. However, we have the following 
result, which we limit to real normed linear spaces for simplicity. 


Example 3. Suppose that $(X, \|.\|)$ is a real normed linear space and that the norm
satisfies the parallelogram identity. Then the function

\[ \langle x, y \rangle = \frac{1}{4}\left[ \|x + y\|^2 - \|x - y\|^2 \right] \]

is an inner product that generates the norm.

It is clear that $\langle x, x \rangle = \|x\|^2$, thus establishing the positivity of the function
$\langle ., . \rangle$ and that it generates the norm. The symmetry of $\langle ., . \rangle$ is obvious. Next we
establish the linearity of $\langle ., . \rangle$.

We leave it to the reader to use the parallelogram identity to show that

\[ \|x + y + z\|^2 + \|x\|^2 + \|y\|^2 + \|z\|^2 = \|x + y\|^2 + \|y + z\|^2 + \|x + z\|^2 . \tag{1} \]

Replacing $z$ with $-z$ in identity (1), we have

\[ \|x + y - z\|^2 + \|x\|^2 + \|y\|^2 + \|z\|^2 = \|x + y\|^2 + \|y - z\|^2 + \|x - z\|^2 . \tag{2} \]

Subtracting (2) from (1), we obtain

\[ \|x + y + z\|^2 - \|x + y - z\|^2 = \|x + z\|^2 - \|x - z\|^2 + \|y + z\|^2 - \|y - z\|^2 , \]

which is equivalent to $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$.

Using the linearity property we just established and induction, it follows
that, for $m \in \mathbb{N}$, $\langle mx, y \rangle = m\langle x, y \rangle$. Since $\langle -x, y \rangle = -\langle x, y \rangle$, the identity
$\langle mx, y \rangle = m\langle x, y \rangle$ holds for all $m \in \mathbb{Z}$. Using this, for all $n \in \mathbb{N}$,
$\langle x, y \rangle = \langle n \cdot \frac{1}{n} x, y \rangle = n \langle \frac{1}{n} x, y \rangle$. Equivalently, $\langle \frac{1}{n} x, y \rangle = \frac{1}{n} \langle x, y \rangle$.

We have shown that, for all $q \in \mathbb{Q}$, $\langle qx, y \rangle = q\langle x, y \rangle$. It is easy to see that if
$\lim_n x_n = x$, then $\lim_n \langle x_n, y \rangle = \langle x, y \rangle$. Now the homogeneity property, $\langle ax, y \rangle =
a\langle x, y \rangle$ for $a \in \mathbb{R}$, is immediate because $\mathbb{Q}$ is dense in $\mathbb{R}$. ♦
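To see the parallelogram identity at work as a test for inner product norms, the following Python sketch (an illustration only; the helper names `parallelogram_defect` and `polarization` are introduced here and are not notation from the text) compares the Euclidean norm with the maximum norm on $\mathbb{R}^2$, and checks that the real polarization formula of example 3 recovers the usual dot product.

```python
import math


def norm2(v):
    """Euclidean norm on R^n."""
    return math.sqrt(sum(t * t for t in v))


def norm_sup(v):
    """Maximum norm on R^n."""
    return max(abs(t) for t in v)


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def parallelogram_defect(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2; zero iff the law holds for x, y."""
    return norm(add(x, y)) ** 2 + norm(sub(x, y)) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2


def polarization(norm, x, y):
    """Real polarization formula: (1/4)(||x+y||^2 - ||x-y||^2)."""
    return 0.25 * (norm(add(x, y)) ** 2 - norm(sub(x, y)) ** 2)


x, y = [1.0, 0.0], [0.0, 1.0]
print(parallelogram_defect(norm2, x, y))     # numerically zero
print(parallelogram_defect(norm_sup, x, y))  # nonzero: the law fails
print(polarization(norm2, [3.0, 4.0], [1.0, 2.0]))  # matches 3*1 + 4*2
```

The nonzero defect for the maximum norm confirms that it is not induced by any inner product.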


Definition. For a subset $A$ of an inner product space $H$, the annihilator of $A$ is
the set of all vectors that are orthogonal to every element in $A$. Symbolically,
$A^\perp = \{x \in H : \langle x, a \rangle = 0 \ \forall a \in A\}$.


Example 4. Consider the space $\mathcal{C}[-\pi, \pi]$ with the inner product
$\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) \overline{g(x)}\, dx$. Let $M$ be the set of functions in $\mathcal{C}[-\pi, \pi]$ that vanish on the
interval $[-\pi, 0]$, and let $N$ be the set of all functions in $\mathcal{C}[-\pi, \pi]$ that vanish on
the interval $[0, \pi]$. Every function in $M$ is orthogonal to every function in $N$.
Thus $N \subseteq M^\perp$ and $M \subseteq N^\perp$. ♦


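Theorem 7.1.4. For subsets $A$ and $B$ of an inner product space $H$,

(a) $A \subseteq A^{\perp\perp}$;
(b) if $A \subseteq B$, then $A^\perp \supseteq B^\perp$;
(c) $A^\perp$ is a closed subspace of $H$; and
(d) $A^\perp = M^\perp$, where $M = \mathrm{Span}(A)$.


Proof. (a) and (b) are obvious.
(c) Since $A^\perp = \bigcap_{a \in A} a^\perp$, it is enough to prove that $a^\perp$ is a closed subspace of
$H$. If $\alpha, \beta \in \mathbb{K}$ and $x, y \in a^\perp$, then $\langle \alpha x + \beta y, a \rangle = \alpha \langle x, a \rangle + \beta \langle y, a \rangle = 0$. Thus
$a^\perp$ is a subspace of $H$. If $x_n \in a^\perp$ and $\lim_n x_n = x$, then $\langle x, a \rangle = \langle \lim_n x_n, a \rangle =
\lim_n \langle x_n, a \rangle = 0$. The continuity of the inner product in its arguments has been
used here. The proof of part (d) is left as an exercise. ■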


Definition. If $M$ is a closed subspace of a Hilbert space $H$, the closed subspace $M^\perp$
is called the orthogonal complement of $M$ (rather than the annihilator of $M$).
The reason for the above terminology will become apparent in theorem 7.1.7.


Example 8 in section 4.7 is a very special case of the theorem below. Observe that 
the completeness of H is crucial here. 


Theorem 7.1.5. Let $C$ be a closed convex subset of a Hilbert space $H$, and let
$x \in H$. Then there exists a unique element $y \in C$ such that
$\|x - y\| = \mathrm{dist}(x, C) = \inf_{z \in C} \|x - z\|$.

Proof. If $x \in C$, take $y = x$. If $x \notin C$, then $\delta = \mathrm{dist}(x, C) > 0$ because $C$ is closed.
There exists a sequence $(y_n)$ in $C$ such that $\lim_n \|x - y_n\| = \delta$. By the parallelogram
law,

\[ \|y_n - y_m\|^2 = \|(y_n - x) - (y_m - x)\|^2 = 2\|y_n - x\|^2 + 2\|y_m - x\|^2 - \|y_n - x + y_m - x\|^2 . \]

Now $\|y_n - x + y_m - x\|^2 = 4\left\| x - \frac{y_n + y_m}{2} \right\|^2 \ge 4\delta^2$. The last inequality is true
because $\frac{y_n + y_m}{2} \in C$ due to the convexity of $C$. Thus

\[ \|y_n - y_m\|^2 \le 2\|y_n - x\|^2 + 2\|y_m - x\|^2 - 4\delta^2 \to 0 \quad \text{as } m, n \to \infty . \]

This shows that $(y_n)$ is a Cauchy sequence, and hence $y = \lim_n y_n$ exists. Since $C$ is
closed, $y \in C$. Now $\delta = \lim_n \|x - y_n\| = \|x - y\|$, and $y$ is one of the closest points
in $C$ to $x$. To show that $y$ is unique, suppose $z \in C$ is such that $\|x - z\| = \delta$. By the
parallelogram law, and as in the calculation above,

\[ \|y - z\|^2 = 2\|y - x\|^2 + 2\|x - z\|^2 - \|y + z - 2x\|^2
   = 2\delta^2 + 2\delta^2 - 4\left\| x - \frac{y + z}{2} \right\|^2 \le 2\delta^2 + 2\delta^2 - 4\delta^2 = 0 . \]

Hence $y = z$. ■
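A concrete instance of theorem 7.1.5 may help. The Python sketch below is an illustration under stated assumptions: we take $H = \mathbb{R}^3$ and $C$ the closed unit ball, for which the closest point to an exterior point $x$ is $x/\|x\|$. The function names and the random sampling of $C$ are choices made here, not part of the text. The sketch also checks the obtuse angle criterion of problem 13 on the sampled points.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    return math.sqrt(dot(v, v))


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def project_onto_unit_ball(x):
    """Closest point to x in the closed unit ball of R^n."""
    r = norm(x)
    return list(x) if r <= 1.0 else [t / r for t in x]


def random_point_in_ball(rng, dim):
    """Rejection sampling from the cube [-1, 1]^dim into the unit ball."""
    while True:
        z = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if norm(z) <= 1.0:
            return z


rng = random.Random(0)  # fixed seed so the run is reproducible
x = [2.0, -1.0, 0.5]
y = project_onto_unit_ball(x)
samples = [random_point_in_ball(rng, 3) for _ in range(1000)]

# y is at least as close to x as every sampled point of C.
closest_ok = all(norm(sub(x, y)) <= norm(sub(x, z)) + 1e-12 for z in samples)
# Obtuse angle criterion: <x - y, z - y> <= 0 for every z in C.
obtuse_ok = all(dot(sub(x, y), sub(z, y)) <= 1e-12 for z in samples)
print(closest_ok, obtuse_ok)
```

Sampling cannot prove the theorem, of course; it only confirms that the computed point behaves as the theorem predicts.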


Corollary 7.1.6. If C is a closed convex subset of a Hilbert space H, then C contains 
a unique element of smallest norm. 


Proof. Apply the above theorem with $x = 0$. ■


Theorem 7.1.7 (the projection theorem). Let $M$ be a closed subspace of a Hilbert
space $H$. Then $H = M \oplus M^\perp$, where $M^\perp$ is the orthogonal complement of $M$.


Proof. Let $x \in H$, let $y$ be the closest element of $M$ to $x$, and let $z = x - y$. Write
$\delta = \mathrm{dist}(x, M) = \|z\|$. We show that $z \in M^\perp$. Let $w \in M$ and, without loss of
generality, assume that $\|w\| = 1$. For any $\alpha \in \mathbb{K}$, $y + \alpha w \in M$, so

\[ \delta^2 \le \|x - y - \alpha w\|^2 = \|z - \alpha w\|^2 = \langle z - \alpha w, z - \alpha w \rangle
   = \|z\|^2 - \alpha \langle w, z \rangle - \overline{\alpha} \langle z, w \rangle + |\alpha|^2 \|w\|^2
   = \delta^2 - 2\,\mathrm{Re}(\alpha \langle w, z \rangle) + |\alpha|^2 . \]

Therefore $2\,\mathrm{Re}(\alpha \langle w, z \rangle) \le |\alpha|^2$. Since the above is true for an arbitrary $\alpha$, choose
$\alpha = \langle z, w \rangle$. We now have $2|\langle w, z \rangle|^2 \le |\langle w, z \rangle|^2$; hence $\langle w, z \rangle = 0$. The proof is now
complete because $M \cap M^\perp = \{0\}$. ■

As an immediate consequence of the above theorem, every element $x \in H$ can be
written uniquely as $x = y + z$, where $y \in M$ and $z \in M^\perp$.


Example 5. Let $M$ be the set of all sequences in $\ell^2$ whose even terms are zero, and
let $N$ be the set of all sequences in $\ell^2$ whose odd terms are zero. It is easy to see
that $M$ and $N$ are closed subspaces of $\ell^2$ and that $N = M^\perp$. Trivially, every vector
$x = (x_n) \in \ell^2$ can be written as $x = (x_1, 0, x_3, 0, \dots) + (0, x_2, 0, x_4, \dots) \in M \oplus N$. ♦


Definition. The element $y$ in theorem 7.1.7 is called the orthogonal projection
of $x$ on $M$. It is worth reiterating that $y$ is the closest element in $M$ to $x$. The
mapping $P_M : H \to H$ defined by $P_M(x) = y$ is called the projection operator
(or simply the projection) of $H$ onto $M$.


Theorem 7.1.8. Let $M$ be a closed subspace of a Hilbert space $H$, and let $P = P_M$ be
the projection of $H$ on $M$. Then

(a) $P$ is bounded and $\|P\| = 1$,
(b) $\mathcal{R}(P) = M$, and
(c) $P^2 = P$.


Proof. (a) Let $x, x' \in H$, and let $x = y + z$ and $x' = y' + z'$, where $y, y' \in M$ and
$z, z' \in M^\perp$. Then $x + x' = (y + y') + (z + z')$. Since $y + y' \in M$ and $z + z' \in M^\perp$,
the uniqueness of the orthogonal projection of $x + x'$ on $M$ forces $P(x + x') = y +
y' = P(x) + P(x')$. The proof that $P(ax) = aP(x)$ is similar. Now $\|x\|^2 = \|y\|^2 +
\|z\|^2$; thus $\|P(x)\| = \|y\| \le \|x\|$, so $\|P\| \le 1$. Since $P(x) = x$ for every $x \in M$,
$\|P\| = 1$. This proves (a). Parts (b) and (c) are obvious. ■


The following theorem gives a complete and simple characterization of the dual
of a Hilbert space. The Riesz representation theorem basically says that a Hilbert
space can be identified isometrically with its own dual in a very natural way.


Theorem 7.1.9 (the Riesz representation theorem). Let $\lambda$ be a bounded linear
functional on a Hilbert space $H$. Then there exists a unique element $y \in H$ such
that $\lambda(x) = \langle x, y \rangle$ for all $x \in H$. Furthermore, $\|\lambda\| = \|y\|$.


Proof. If $\lambda = 0$, take $y = 0$. Otherwise, let $M = \mathrm{Ker}(\lambda)$; $M$ is a closed subspace of
$H$ because $M = \lambda^{-1}(0)$, and $M \neq H$ because $\lambda \neq 0$. By the projection theorem,
$H = M \oplus M^\perp$. Pick a nonzero element $z \in M^\perp$. Then $\lambda(z) \neq 0$, and, by replacing
$z$ with $z/\lambda(z)$, we may assume that $\lambda(z) = 1$. For $x \in H$, $x = x - \lambda(x)z + \lambda(x)z$. It
is easy to verify that $w = x - \lambda(x)z \in M$.

Observe that $\langle x, z \rangle = \langle w, z \rangle + \langle \lambda(x)z, z \rangle = \lambda(x)\|z\|^2$. Define $y = z/\|z\|^2$. Then,
by the above identity, $\lambda(x) = \langle x, z \rangle / \|z\|^2 = \langle x, y \rangle$. To prove that $y$ is unique, suppose
that there is another element $y_1 \in H$ such that $\langle x, y \rangle = \langle x, y_1 \rangle$ for all $x \in H$. Then
$\langle x, y - y_1 \rangle = 0$ for all $x \in H$. Choose $x = y - y_1$. Then $\|y - y_1\|^2 = 0$; hence $y = y_1$.

Finally, $|\lambda(x)| = |\langle x, y \rangle| \le \|x\| \|y\|$. Thus $\|\lambda\| \le \|y\|$.
Also $|\lambda(y)| = |\langle y, y \rangle| = \|y\|^2 = \|y\| \|y\|$. This shows that $\|\lambda\| \ge \|y\|$ and that
$\|\lambda\| = \|y\|$. ■
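In finite dimensions the Riesz representation theorem is completely explicit. The following Python sketch is an illustration under stated assumptions: we work in $\mathbb{C}^3$ with the standard inner product, linear in the first argument, and the coefficient vector `c` and the name `lam` are made up for this example. A functional $\lambda(x) = \sum_i c_i x_i$ is represented by the vector $y$ with entries $\overline{c_i}$, and its norm is attained at $y/\|y\|$.

```python
import math


def inner(x, y):
    """<x, y> on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x).real)


# Coefficients of an illustrative bounded linear functional on C^3.
c = [2 + 1j, -1j, 3 + 0j]


def lam(x):
    """The functional lambda(x) = sum_i c_i x_i."""
    return sum(ci * xi for ci, xi in zip(c, x))


# Riesz representer: lambda(x) = <x, y> forces y_i = conj(c_i).
y = [ci.conjugate() for ci in c]

x = [1 - 2j, 0.5 + 0j, 4j]
print(abs(lam(x) - inner(x, y)))  # numerically zero

# The norm of lambda is attained at the unit vector y / ||y||.
unit = [t / norm(y) for t in y]
attained = abs(lam(unit))
print(attained, norm(y))
```

The two printed values on the last line agree, in line with $\|\lambda\| = \|y\|$.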


Recall that a hyperplane in $\mathbb{R}^n$ is nothing other than the translation of the null-space
of a linear functional on $\mathbb{R}^n$, that all linear functionals on $\mathbb{R}^n$ are continuous,
and that all maximal subspaces are closed. In infinite dimensions, the null-space
of a linear functional $\lambda$ is closed if and only if $\lambda$ is continuous. The following result
is the exact analog of example 10 in section 4.7.


Example 6. Let $C$ be a closed convex subset of a real Hilbert space $H$, and let
$a \in H - C$. Then there exists a bounded functional $\lambda$ on $H$ and a constant $b$
such that $\lambda(y) < b$ for every $y \in C$, and $\lambda(a) > b$.


The obtuse angle criterion extends to the current situation, and the proof is
identical to that in example 9 in section 4.7. Thus if $z$ is the closest element
in $C$ to $a$, then, for every $y \in C$, $\langle a - z, y - z \rangle \le 0$. Let $m = (a + z)/2$, and define
$n = a - z$, $\lambda(x) = \langle x, n \rangle$, and $b = \lambda(m)$. As in example 10 in section 4.7, we may
assume that $m = 0$; hence $b = 0$. It is easy to verify that $\lambda(y) < 0$ for all $y \in C$
and that $\lambda(a) > 0$. ♦


The Completion of an Inner Product Space. 


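Example 7. If $(x_n)$ and $(y_n)$ are Cauchy sequences in an inner product space, then
$\lim_n \langle x_n, y_n \rangle$ exists.
We prove that the sequence $\langle x_n, y_n \rangle$ is Cauchy in $\mathbb{C}$; hence the limit in question
exists. Recall that Cauchy sequences are bounded. Now

\[ |\langle x_n, y_n \rangle - \langle x_m, y_m \rangle| = |\langle x_n - x_m, y_n \rangle + \langle x_m, y_n - y_m \rangle|
   \le \|x_n - x_m\| \|y_n\| + \|x_m\| \|y_n - y_m\| \to 0 \quad \text{as } m, n \to \infty . \] ♦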


Theorem 7.1.10. Let $(X, \langle ., . \rangle)$ be an incomplete inner product space. Then there
exists a Hilbert space $H$ that contains $X$ as a dense subspace such that the inner
product on $X$ is the restriction of the inner product on $H$. If $X$ is separable, so is $H$.

Proof. Let $\|.\|$ be the norm on $X$ induced by the inner product, and let $H$ be the
completion of $X$ with respect to the norm on $X$ (theorem 6.4.9). Refer to the
extended norm by $\|.\|'$. For $x, y \in H$, choose sequences $(x_n)$ and $(y_n)$ in $X$ such
that $x_n \to x$ and $y_n \to y$, and extend the definition of the inner product to $H$ by
$\langle x, y \rangle' = \lim_n \langle x_n, y_n \rangle$; the limit exists by example 7. We leave it to the reader to verify that the inner product
we just defined is well defined and that it is indeed an inner product. Clearly, the
inner product on $H$ extends that on $X$. Finally, we prove that the extended
inner product induces the extended norm on $H$. For a sequence $(x_n)$ in $X$ converging to
$x \in H$,

\[ \langle x, x \rangle' = \lim_n \langle x_n, x_n \rangle = \lim_n \|x_n\|^2 = \lim_n (\|x_n\|')^2 = (\|x\|')^2 . \] ■


Exercises 


1. Prove that the norm on $\mathcal{C}[0, 1]$ generated by the inner product
   \[ \langle f, g \rangle = \int_0^1 f(x) \overline{g(x)}\, dx \]
   is not complete.

2. Prove the parallelogram law and the polarization identity.

3. Let $x$ and $y$ be nonzero vectors in an inner product space. Prove that there
   exists a unique number $0 \le \theta \le \pi$ such that $\cos\theta = \dfrac{\mathrm{Re}\langle x, y \rangle}{\|x\| \|y\|}$. Conclude that
   \[ \|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\|x\| \|y\| \cos\theta . \]

4. Prove the Apollonius identity: For vectors $x$, $y$, and $z$ in an inner product
   space, $\|z - x\|^2 + \|z - y\|^2 = \frac{1}{2}\|x - y\|^2 + 2\left\| z - \frac{x + y}{2} \right\|^2$.

5. Let $A$ be a subset of a Hilbert space $H$, and let $M = \mathrm{Span}(A)$. Prove that
   $A^\perp = M^\perp$.

6. Let $M$ be a closed subspace of a Hilbert space. Prove that $M = M^{\perp\perp}$. Give
   an example to show that the result fails if $M$ is not closed. More generally,
   show that $M^{\perp\perp} = \overline{M}$.

7. Show that if $A$ is a subset of a Hilbert space $H$, then $A^{\perp\perp}$ is the smallest
   closed subspace of $H$ containing $A$.

8. Let $(x_n)$ and $(y_n)$ be sequences in an inner product space. Prove that
   (a) if $\lim_n x_n = 0$, and $(y_n)$ is bounded, then $\lim_n \langle x_n, y_n \rangle = 0$; and
   (b) if $y \perp x_n$ for each $n \in \mathbb{N}$, and $\sum_{n=1}^{\infty} x_n$ is convergent, then $y \perp \sum_{n=1}^{\infty} x_n$.

9. Prove that if an element $x$ in a Hilbert space $H$ is orthogonal to every vector
   in a dense subset of $H$, then $x = 0$.

10. Let $(x_n)$ be a sequence of mutually orthogonal vectors in a Hilbert space $H$.
    Prove that $\sum_{n=1}^{\infty} x_n$ converges in $H$ if and only if $\sum_{n=1}^{\infty} \|x_n\|^2 < \infty$.

11. Use Hilbert space methods to provide an easy proof of the Hahn-Banach
    theorem for Hilbert spaces.

12. Let $M = \{x = (x_1, \dots, x_n) \in \mathbb{R}^n : \sum_{i=1}^{n} x_i = 1\}$. Show that $M$ is closed and
    convex, and find the element in $M$ closest to the origin.

13. Let $C$ be a closed convex subset of a Hilbert space $H$, let $x \in H - C$, and
    let $y$ be the closest element of $C$ to $x$. Prove that, for every $z \in C$,
    $\mathrm{Re}\langle x - y, z - y \rangle \le 0$.

14. Let $\delta_n$ be a positive sequence, and let $C = \{x \in \ell^2 : |x_n| \le \delta_n\}$. Show that $C$
    is compact if and only if $\sum_{n=1}^{\infty} \delta_n^2 < \infty$.


7.2 Orthonormal Bases and Fourier Series 


In the introduction to section 7.1, we made the case for the existence of a maximal
orthonormal sequence $\{u_1, u_2, \dots\}$ in a Hilbert space $H$. As you will see in this
section, some Hilbert spaces do not admit countable maximal orthonormal
subsets. We must therefore first tackle the problem of the existence of a maximal
orthonormal subset of $H$, then examine the problem of which Hilbert spaces possess
a countable such subset. In this section, we provide solutions to both problems
and reveal the basic structure of a Hilbert space, thereby paving the way to answer
the problems posed in section 4.10.


The proof of the following theorem can be found in section 3.7.

Theorem 7.2.1. An orthogonal subset $S$ of a Hilbert space $H$ is independent. ■


Definition. An orthonormal basis for a Hilbert space $H$ is a maximal orthonormal
subset of $H$. An orthonormal subset of $H$ is maximal if it is not properly
contained in another orthonormal subset of $H$.


Example 1. We show that $S = \{e_n : n \in \mathbb{N}\}$ is an orthonormal basis for $\ell^2$.
It is clear that $S$ is orthonormal. If $x = (x_n) \in \ell^2$ is orthogonal to $S$, then, for every
$n \in \mathbb{N}$, $x_n = \langle x, e_n \rangle = 0$, and hence $x = 0$. ♦


In the theorem below, we prove a little more than the existence of an orthonormal 
basis for an arbitrary Hilbert space. 


Theorem 7.2.2. Every orthonormal subset $A$ of a Hilbert space $H$ is contained
in an orthonormal basis for $H$. In particular, every Hilbert space contains an
orthonormal basis.

Proof. Let $\mathfrak{B}$ be the collection of all orthonormal subsets of $H$ that contain $A$; $\mathfrak{B}$ is
not empty since $A$ is one of its members. Order $\mathfrak{B}$ by set inclusion. It is rather
straightforward to show that the union of the members of a chain in $\mathfrak{B}$ is an
orthonormal subset of $H$ that contains $A$ and is therefore an upper bound of the
chain. By Zorn’s lemma, $\mathfrak{B}$ has a maximal member, that is, an orthonormal basis
of $H$ containing $A$. To prove that every Hilbert space possesses an orthonormal
basis, apply the result we just proved with $A = \{x\}$, where $x$ is a unit vector. ■


The goal of this section is to represent an arbitrary element of a Hilbert space
$H$ in terms of a basis of some kind. If $\dim(H) < \infty$, the goal is too trivial, and
if $\dim(H) = \infty$, the goal is unrealistic if one insists on looking at a Hamel basis
because any such basis is uncountable and hence too big to be useful. The only
realistic expectation is to hope to express an arbitrary element of $H$ as a series
of the basis elements, as was achieved in section 4.10 for trigonometric series of
continuous functions. This means that $H$ has a Schauder basis, which immediately
suggests that we investigate separable Hilbert spaces (see problem 12 in section
6.1). The following theorem is the happy coincidence we hope for.


Theorem 7.2.3. A Hilbert space $H$ is separable if and only if every orthonormal basis
of $H$ is countable.


Proof. If $H$ is separable, then $H$ contains a countable dense subset $\{x_1, x_2, \dots\}$ and,
clearly, $H = \bigcup_{n \in \mathbb{N}} B(x_n, 1/2)$. If $S = \{u_\alpha\}_{\alpha \in I}$ is an orthonormal basis for $H$, then,
for distinct $\alpha, \beta \in I$, $\|u_\alpha - u_\beta\| = \sqrt{2}$. Since the diameter of each of the balls $B(x_n, 1/2)$
is 1, no such ball can contain more than one member of $S$. Therefore $S$ is at most
countable.

Conversely, if $H$ possesses a countable orthonormal basis $S = \{u_n : n \in \mathbb{N}\}$,
let $A$ be the collection of all finite linear combinations of elements of $S$ with
coefficients in $\mathbb{Q} + i\mathbb{Q}$. We claim that $A$ is dense in $H$. This will conclude the
proof because $A$ is countable. To prove the claim, let $M$ be the closure of $A$. To
show that $M$ is a subspace of $H$, let $x, y \in M$, and let $a, b \in \mathbb{K}$. Then there exist
sequences $(x_n)$ and $(y_n)$ in $A$, and sequences $a_n, b_n \in \mathbb{Q} + i\mathbb{Q}$ such that $\lim_n x_n =
x$, $\lim_n y_n = y$, $\lim_n a_n = a$, and $\lim_n b_n = b$. The sequence $(a_n x_n + b_n y_n)$ is in
$A$, and $\lim_n (a_n x_n + b_n y_n) = ax + by$. Therefore $ax + by \in M$. We now show that
$M = H$. If not, then $H = M \oplus M^\perp$, and $M^\perp \neq \{0\}$. Pick a unit vector $z \in M^\perp$. Then
$S \cup \{z\}$ is an orthonormal subset of $H$ that properly contains $S$. This contradicts the
maximality of $S$ and completes the proof. ■


Example 2. It is possible for a separable inner product space (hence for a separable
Hilbert space) to contain uncountably many pairs of orthogonal vectors.
Consider the space $\mathcal{C}[-\pi, \pi]$ with the inner product $\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) \overline{g(x)}\, dx$;
$\mathcal{C}[-\pi, \pi]$ is separable.* In the notation of example 4 in section 7.1, every pair
of functions $(f, g) \in M \times N$ is orthogonal. Since both $M$ and $N$ are uncountable,
we have proved our assertion. ♦


We focus mostly but not exclusively on separable Hilbert spaces. The existence
of inseparable Hilbert spaces of arbitrary Hilbert dimension will be presented in
the excursion at the end of this section. Many of the results we develop in this
chapter are valid for inseparable Hilbert spaces. Examples include the projection
theorem, the Riesz representation theorem, and the next three theorems. Also, in
the definition below, the set $I$ need not be countable; hence $H$ is not assumed to be
separable.


Definition. Let $S = \{u_\alpha : \alpha \in I\}$ be an orthonormal subset of a Hilbert space $H$. It
is not assumed that $S$ is an orthonormal basis. For an element $x \in H$, the scalars
$\hat{x}_\alpha = \langle x, u_\alpha \rangle$ are called the Fourier coefficients of $x$ relative to $S$.


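Theorem 7.2.4. Let $S = \{u_1, \dots, u_n\}$ be an orthonormal subset of $H$, and let
$x \in \mathrm{Span}(S)$. Then $x = \sum_{i=1}^{n} \hat{x}_i u_i$ and $\|x\|^2 = \sum_{i=1}^{n} |\hat{x}_i|^2$.


Proof. See the proof of theorem 3.7.5. ■


Theorem 7.2.5. Suppose $S = \{u_1, \dots, u_n\}$ is an orthonormal subset of $H$, and let
$M = \mathrm{Span}(S)$. For a vector $x \in H$, the vector $y = \sum_{i=1}^{n} \hat{x}_i u_i$ is the orthogonal
projection of $x$ on $M$. In particular, for all scalars $a_1, \dots, a_n$,
$\left\| x - \sum_{i=1}^{n} \hat{x}_i u_i \right\| \le \left\| x - \sum_{i=1}^{n} a_i u_i \right\|$. Furthermore, $\sum_{i=1}^{n} |\hat{x}_i|^2 \le \|x\|^2$.

Proof. We only need to show that the vector $z = x - y = x - \sum_{i=1}^{n} \hat{x}_i u_i$ is in $M^\perp$. The
rest of the assertions follow from the projection theorem and theorem 7.2.4. Now,
for a fixed $1 \le j \le n$,

\[ \langle z, u_j \rangle = \langle x, u_j \rangle - \sum_{i=1}^{n} \hat{x}_i \langle u_i, u_j \rangle = \langle x, u_j \rangle - \hat{x}_j = 0 . \] ■


Theorem 7.2.6 (Bessel’s inequality). Let $\{u_n\}$ be an orthonormal subset of a Hilbert
space $H$. Then, for $x \in H$, $\sum_{n=1}^{\infty} |\hat{x}_n|^2 \le \|x\|^2$.


Proof. By theorem 7.2.5, $\sum_{i=1}^{n} |\hat{x}_i|^2 \le \|x\|^2$ for each $n \in \mathbb{N}$. Taking the limit as
$n \to \infty$ yields Bessel’s inequality. ■


* The set of trigonometric polynomials with rational coefficients is dense in $\mathcal{C}[-\pi, \pi]$. See corollary
4.10.3.

Theorem 7.2.7. Let $H$ be a separable Hilbert space, and let $S = \{u_n : n \in \mathbb{N}\}$ be an
orthonormal subset of $H$. Then the following are equivalent:

(a) $S$ is an orthonormal basis for $H$.
(b) For every $x \in H$, $x = \sum_{n=1}^{\infty} \hat{x}_n u_n$.
(c) $\mathrm{Span}(S)$ is dense in $H$.
(d) For every $x \in H$, $\|x\|^2 = \sum_{n=1}^{\infty} |\hat{x}_n|^2$.
(e) Parseval’s identity. For every $x, y \in H$, $\langle x, y \rangle = \sum_{n=1}^{\infty} \hat{x}_n \overline{\hat{y}_n}$.


Proof. (a) implies (b). Let $y_n = \sum_{i=1}^{n} \hat{x}_i u_i$.
For $n < m$, $\|y_m - y_n\|^2 = \left\| \sum_{i=n+1}^{m} \hat{x}_i u_i \right\|^2 = \sum_{i=n+1}^{m} |\hat{x}_i|^2 \to 0$ as $m, n \to \infty$,
because $\sum_{n=1}^{\infty} |\hat{x}_n|^2 < \infty$ (Bessel’s inequality). This shows that $(y_n)$ is a Cauchy
sequence in $H$; hence it converges to, say, $y$. Thus $y = \sum_{n=1}^{\infty} \hat{x}_n u_n$. We need to show
that $y = x$. For a fixed $k \in \mathbb{N}$, $\langle y, u_k \rangle = \lim_{n \to \infty} \langle y_n, u_k \rangle = \hat{x}_k = \langle x, u_k \rangle$. Thus $y - x$ is
orthogonal to each $u_k$. If $y - x \neq 0$, then $S \cup \left\{ \frac{y - x}{\|y - x\|} \right\}$ would be an orthonormal set
that properly contains $S$. The maximality of $S$ forces $y = x$.

That (b) implies (c) is obvious, since $x = \lim_{n \to \infty} \sum_{k=1}^{n} \hat{x}_k u_k$, and each $\sum_{k=1}^{n} \hat{x}_k u_k$
is in $\mathrm{Span}(S)$.

(c) implies (d). Suppose, for some $x \in H$, $\sum_{n=1}^{\infty} |\hat{x}_n|^2 < \|x\|^2$, and let $\delta^2 =
\|x\|^2 - \sum_{n=1}^{\infty} |\hat{x}_n|^2$. We show that the ball $B(x, \delta)$ contains no finite linear
combination of $S$. This will show that $\mathrm{Span}(S)$ is not dense in $H$. If $\sum_{i=1}^{n} a_i u_i \in
\mathrm{Span}(S)$, then, by theorem 7.2.5,

\[ \left\| x - \sum_{i=1}^{n} a_i u_i \right\|^2 \ge \left\| x - \sum_{i=1}^{n} \hat{x}_i u_i \right\|^2 = \|x\|^2 - \sum_{i=1}^{n} |\hat{x}_i|^2 \ge \|x\|^2 - \sum_{i=1}^{\infty} |\hat{x}_i|^2 = \delta^2 . \]

(d) implies (e). The identity $\|x\|^2 = \sum_{n=1}^{\infty} |\hat{x}_n|^2$ can be written as $\|x\|^2 = \|\hat{x}\|_2^2$,
where $\hat{x} = (\hat{x}_n) \in \ell^2$, and $\|\hat{x}\|_2$ is the $\ell^2$-norm of $\hat{x}$. Now, assuming (d) is true, then,
for every $a \in \mathbb{K}$, $\langle x + ay, x + ay \rangle = \langle \hat{x} + a\hat{y}, \hat{x} + a\hat{y} \rangle$. Equivalently, $a \langle y, x \rangle +
\overline{a} \langle x, y \rangle = a \langle \hat{y}, \hat{x} \rangle + \overline{a} \langle \hat{x}, \hat{y} \rangle$. Setting $a = 1/2$, we obtain $\mathrm{Re}\langle x, y \rangle = \mathrm{Re}\langle \hat{x}, \hat{y} \rangle$.
Setting $a = 1/(2i)$ yields $\mathrm{Im}\langle x, y \rangle = \mathrm{Im}\langle \hat{x}, \hat{y} \rangle$. This proves that $\langle x, y \rangle = \langle \hat{x}, \hat{y} \rangle$, which
is equivalent to $\langle x, y \rangle = \sum_{n=1}^{\infty} \hat{x}_n \overline{\hat{y}_n}$.

(e) implies (a). Suppose there exists a unit vector $u$ such that $S \cup \{u\}$ is orthonormal.
Then $\hat{u}_k = \langle u, u_k \rangle = 0$ for all $k \in \mathbb{N}$, and $1 = \langle u, u \rangle = \sum_{k=1}^{\infty} \hat{u}_k \overline{\hat{u}_k} = 0$. This
contradiction shows that (a) is true. ■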
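The statements of theorems 7.2.5 through 7.2.7 can be checked by hand in a small case. The Python sketch below is an illustration only: the space $\mathbb{R}^4$, the particular orthonormal basis, and the vector `x` are choices made for this example. With the first two basis vectors we see Bessel's inequality and the orthogonal projection; with all four we see the equality of theorem 7.2.7(d).

```python
def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


# An orthonormal basis of R^4 (illustrative choice).
basis = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]
x = [1.0, 2.0, 3.0, 4.0]

# Fourier coefficients of x relative to the basis.
coeffs = [dot(x, u) for u in basis]

# S = first two basis vectors: an orthonormal set that is not a basis.
S = basis[:2]
partial_energy = sum(c * c for c in coeffs[:2])  # Bessel: at most ||x||^2
full_energy = sum(c * c for c in coeffs)         # equals ||x||^2 for a basis
norm_sq = dot(x, x)

# Orthogonal projection of x on Span(S), and the residual z = x - y.
y = [sum(c * u[k] for c, u in zip(coeffs[:2], S)) for k in range(4)]
z = [a - b for a, b in zip(x, y)]

print(coeffs, partial_energy, full_energy, norm_sq)
print(y, [dot(z, u) for u in S])  # the residual is orthogonal to S
```

Here the partial sum of squared coefficients falls strictly below $\|x\|^2$ because $x$ has a nonzero component along the third basis vector.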


Example 3. Every element in $\ell^2$ can be written as a series $x = \sum_{n=1}^{\infty} x_n e_n$.
Consider the vectors $y_n = x - \sum_{i=1}^{n} x_i e_i = (0, 0, \dots, 0, x_{n+1}, x_{n+2}, \dots)$. Since
$\lim_n \|y_n\|^2 = \lim_n \sum_{i=n+1}^{\infty} |x_i|^2 = 0$, $x = \sum_{n=1}^{\infty} x_n e_n$. ♦

Example 4. Consider the countable set of functions $S = \{e^{inx} : n \in \mathbb{Z}\}$. By corollary
4.10.3, $\mathrm{Span}(S)$ is dense in $(\mathcal{C}[-\pi, \pi], \|.\|_2)$. Therefore $\mathrm{Span}(S)$ is dense
in the completion, $H$, of $(\mathcal{C}[-\pi, \pi], \|.\|_2)$. Thus the set $\{e^{inx} : n \in \mathbb{Z}\}$ is an
orthonormal basis for $H$. We will see in section 8.9 that $H$ is the space $\mathfrak{L}^2(-\pi, \pi)$
of (Lebesgue) square integrable functions on $(-\pi, \pi)$.

For exactly the same reason (see theorem 4.10.8), the set of normalized
Legendre polynomials $\{P_n : n \in \mathbb{N}\}$ is an orthonormal basis for $\mathfrak{L}^2(-1, 1)$. ♦


Definition. Two Hilbert spaces $H_1$ and $H_2$ are isomorphic (as Hilbert spaces) if
there exists an isomorphism $T : H_1 \to H_2$ such that, for all $x, y \in H_1$,

\[ \langle x, y \rangle = \langle Tx, Ty \rangle . \]

It follows directly that such an isomorphism is also an isometry because
$\|x\|^2 = \langle x, x \rangle = \langle Tx, Tx \rangle = \|Tx\|^2$.

Theorem 7.2.8 (the Riesz-Fischer theorem). Let $H$ be a separable Hilbert space.

(a) If $\dim(H) = n$, then $H$ is isomorphic to $\mathbb{K}^n$.
(b) If $\dim(H) = \infty$, then $H$ is isomorphic to $\ell^2$.


Proof. We only prove the second statement. The proof of the first statement is simpler.
Let $\{u_n\}$ be an orthonormal basis for $H$. For $x \in H$, let $T(x) = (\hat{x}_n)_{n=1}^{\infty} = \hat{x}$.
$T : H \to \ell^2$ is linear since $\widehat{(ax + by)} = a\hat{x} + b\hat{y}$. The fact that $\langle x, y \rangle = \langle Tx, Ty \rangle$ is
Parseval’s identity in theorem 7.2.7. To verify that $T$ is one-to-one, suppose that
$\hat{x} = \hat{y}$. Then $\widehat{(x - y)} = 0$, and $\sum_{n=1}^{\infty} |\hat{x}_n - \hat{y}_n|^2 = 0$. Therefore $\hat{x}_n = \hat{y}_n$. Hence, by
theorem 7.2.7, $x = \sum_{n=1}^{\infty} \hat{x}_n u_n = \sum_{n=1}^{\infty} \hat{y}_n u_n = y$; $T$ is onto because if $(a_n) \in \ell^2$,
then the series $\sum_{n=1}^{\infty} a_n u_n$ converges to a vector $x \in H$ such that $\hat{x} = (a_n)$. See
problem 3 at the end of this section. ■


We offer a few observations on some crucial differences between Banach and 
Hilbert spaces. This will hopefully explain why Hilbert spaces have such an elegant 
and uncluttered structure compared to a general Banach space. 


The closest point property and the projection theorem (theorems 7.1.5 and 7.1.7,
respectively) are at the heart of the constructions of this chapter. An examination
of the proof of theorem 7.1.5 reveals that the parallelogram law delivers both
the existence and the uniqueness of the closest point to a closed convex set. The
parallelogram law is a direct result of the fact that the norm on a Hilbert space is
induced by an inner product, which is what sets Hilbert spaces apart from general
Banach spaces, where the closest point property fails as does the conclusion of the
projection theorem. The following simple example illustrates one of the discussion
points.


Example 5. Consider the space $\mathbb{R}^2$ with the norm $\|x\|_\infty = \max\{|x_1|, |x_2|\}$. The set
$M = \{x = (x_1, x_2) \in \mathbb{R}^2 : |x_1| \le 1, |x_2| \le 1\}$ is closed and convex. Every point on
the line segment $\{(1, y) : |y| \le 1\}$ has distance 1 from the point $x = (2, 0)$, and
$\mathrm{dist}(x, M) = 1$. There are examples where the very existence of a closest point is
not guaranteed. See problems 6-8 at the end of this section for a slight expansion
of this discussion. ♦
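The failure of uniqueness in example 5 is easy to observe on a computer. The Python sketch below is an illustration under stated assumptions: the square $M$ is replaced by a finite grid with spacing $1/10$, and the helper `closest_points` is a name introduced here. Under the maximum norm every grid point on the right edge of the square is a closest point to $x = (2, 0)$, whereas under the Euclidean norm the closest point is unique.

```python
import math


def dist_sup(p, q):
    """Distance induced by the maximum norm on R^2."""
    return max(abs(a - b) for a, b in zip(p, q))


def dist_euclid(p, q):
    """Distance induced by the Euclidean norm on R^2."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def closest_points(dist, x, candidates, tol=1e-12):
    """Return the minimal distance and all candidates attaining it (up to tol)."""
    best = min(dist(x, p) for p in candidates)
    return best, [p for p in candidates if dist(x, p) <= best + tol]


# Grid approximation of the square M = [-1, 1] x [-1, 1] (spacing 1/10).
grid = [(i / 10, j / 10) for i in range(-10, 11) for j in range(-10, 11)]
x = (2.0, 0.0)

d_sup, argmin_sup = closest_points(dist_sup, x, grid)
d_euc, argmin_euc = closest_points(dist_euclid, x, grid)

print(d_sup, len(argmin_sup))  # many closest points in the maximum norm
print(d_euc, argmin_euc)       # a single closest point in the Euclidean norm
```

The contrast reflects the fact that the maximum norm does not satisfy the parallelogram law, which is what the uniqueness argument in theorem 7.1.5 relies on.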


It was mentioned in section 6.4 that not every closed subspace of a Banach space is 
complemented. Theorem 7.1.7 guarantees that every closed subspace of a Hilbert 
space is complemented. Projections in Banach spaces play a similar role to orthogonal
projections in proving that certain closed subspaces are complemented. See
problem 10 in section 6.4 for necessary and sufficient conditions for a closed
subspace of a Banach space to be complemented. Also examine example 6 in
section 6.4.


Excursion: Inseparable Hilbert Spaces 


Inseparable Hilbert spaces do exist. They are mostly a curiosity and do not have 
much practical use. We include the discussion below for the satisfaction of the 
inquisitive reader. 

The motivation for the definition below and the construction in theorem 7.2.9 
is provided by the following example. 


Example 6. Let $S = \{u_\alpha : \alpha \in I\}$ be an uncountable orthonormal subset of a
Hilbert space $H$. For a vector $x \in H$, consider the set of Fourier coefficients
$\{\hat{x}_\alpha : \alpha \in I\}$. We claim that $\hat{x}_\alpha = 0$ for all but countably many $\alpha \in I$.

Let $\{u_{\alpha_1}, \dots, u_{\alpha_n}\}$ be a finite subset of $S$. By theorem 7.2.5, $\sum_{i=1}^{n} |\hat{x}_{\alpha_i}|^2 \le
\|x\|^2 < \infty$. It follows that $\sum_{\alpha \in I} |\hat{x}_\alpha|^2 < \infty$ (see example 1 in section 4.10 and
the definition preceding it); hence the set $\{\alpha \in I : \hat{x}_\alpha \neq 0\}$ is countable. ♦


The above example strongly suggests the following definition. 


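Definition. Let $I$ be an infinite set, and let $\aleph = \mathrm{Card}(I)$. Define $\ell^2(\aleph)$ to be the
set of all functions $x : I \to \mathbb{C}$ such that $x_\alpha = 0$ for all but countably many $\alpha \in I$
and $\|x\| = \left( \sum_{\alpha \in I} |x_\alpha|^2 \right)^{1/2} < \infty$. To eliminate any danger of ambiguity, let $I_x =
\{\alpha_1, \alpha_2, \dots\}$ be the subset of $I$ for which $x_\alpha \neq 0$. The notation $\sum_{\alpha \in I} |x_\alpha|^2$ means
$\sum_{n=1}^{\infty} |x_{\alpha_n}|^2$. We will continue to employ this notation for the remainder of this
discussion.

Theorem 7.2.9. The set $\ell^2(\aleph)$ is a Hilbert space with the operations defined within
the proof.


Proof. Let $x = (x_\alpha)$ and $y = (y_\alpha) \in \ell^2(\aleph)$. We show that $x + y \in \ell^2(\aleph)$ and that
$\|x + y\| \le \|x\| + \|y\|$. Let $I_x = \{\alpha \in I : x_\alpha \neq 0\}$, and $I_y = \{\alpha \in I : y_\alpha \neq 0\}$, and
let $J = I_x \cup I_y$. Since $J$ is countable, we can write $J = \{\alpha_1, \alpha_2, \dots\}$. Note that $\tilde{x} = (x_{\alpha_n})$
and $\tilde{y} = (y_{\alpha_n})$ are in $\ell^2$; hence $\|\tilde{x} + \tilde{y}\|_2 \le \|\tilde{x}\|_2 + \|\tilde{y}\|_2$. But every $\alpha$ for which
$x_\alpha + y_\alpha \neq 0$ is in $J$; hence

\[ \|x + y\| = \left( \sum_{n=1}^{\infty} |x_{\alpha_n} + y_{\alpha_n}|^2 \right)^{1/2} = \|\tilde{x} + \tilde{y}\|_2 \le \|\tilde{x}\|_2 + \|\tilde{y}\|_2 = \|x\| + \|y\| . \]

The fact that $\|ax\| = |a| \|x\|$ for all $x \in \ell^2(\aleph)$ and all scalars $a$
requires an even simpler argument. The rest of the properties of a normed linear
space are easily verifiable. Thus $\ell^2(\aleph)$ is a normed linear space.

Define an inner product on $\ell^2(\aleph)$ as follows: $\langle x, y \rangle = \langle \tilde{x}, \tilde{y} \rangle = \sum_{n=1}^{\infty} x_{\alpha_n} \overline{y_{\alpha_n}}$. This
inner product induces the norm on $\ell^2(\aleph)$ we defined earlier. We now show
the completeness of $\ell^2(\aleph)$. Suppose $(x^{(n)})$ is a Cauchy sequence in $\ell^2(\aleph)$, let
$I_n = \{\alpha \in I : x^{(n)}_\alpha \neq 0\}$, and let $J = \bigcup_{n \in \mathbb{N}} I_n$. Then $J$ is a countable subset of $I$,
and we can write $J = \{\alpha_1, \alpha_2, \dots\}$. Since $\|\tilde{x}^{(n)} - \tilde{x}^{(m)}\|_2 = \|x^{(n)} - x^{(m)}\|$, $(\tilde{x}^{(n)})$ is a
Cauchy sequence in $\ell^2$ and is therefore convergent to an element $\tilde{x} = (x_1, x_2, \dots) \in \ell^2$.
Define $x \in \ell^2(\aleph)$ by

\[ x_\alpha = \begin{cases} x_i & \text{if } \alpha = \alpha_i, \\ 0 & \text{otherwise.} \end{cases} \]

Clearly, $x^{(n)}$ converges to $x$ in $\ell^2(\aleph)$. ■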


The reader can now anticipate the theorem that must be stated next: the set $\{e_\alpha\}_{\alpha \in I}$
is an orthonormal basis for $\ell^2(\aleph)$, where $e_\alpha(\beta) = \delta_{\alpha, \beta}$. Thus, for any cardinal
number $\aleph$, we have constructed a Hilbert space whose orthonormal basis has
cardinality $\aleph$. Such a space is also unique up to Hilbert space isomorphism in
the sense that it depends only on $\aleph$ and not on the particular set $I$ in the above
construction. We leave it to the interested reader to reflect on the details.

The cardinality of an orthonormal basis of a Hilbert space H is known as the 
Hilbert dimension of H. 


Exercises 


1. Let $\{u_n\}$ be an orthonormal basis for a separable Hilbert space $H$, and let
   $\{v_n\}$ be an orthonormal set in $H$ such that $\sum_{n=1}^{\infty} \|u_n - v_n\|^2 < 1$. Prove that
   $\{v_n\}$ is an orthonormal basis for $H$.

2. Let $S = \{v_1, v_2, \dots\}$ be an orthonormal subset of a separable Hilbert space $H$
   (not necessarily an orthonormal basis), and let $M = \overline{\mathrm{Span}(S)}$. Prove that if
   $P$ is the projection of $H$ onto $M$, then $Px = \sum_{j=1}^{\infty} \langle x, v_j \rangle v_j$.

3. Let $\{u_n\}$ be an orthonormal basis for a separable Hilbert space $H$, and let
   $(a_n) \in \ell^2$. Prove that there is an element $x \in H$ such that $\hat{x}_n = a_n$.

4. Use theorems 6.2.4 and 7.2.8 to provide an alternative proof of the Riesz
   representation theorem for separable Hilbert spaces.

5. Let $\{u_n\}$ be an orthonormal basis for a separable Hilbert space $H$. Define a
function ||.||’ : H > R as follows: ||x||’ = ye Bal: Show that ||.||' is a 
norm on H and that it is not equivalent to the original norm on H. 

6. Let $X = \mathcal{C}[0, 1]$ endowed with the uniform norm, and let $M$ be the subset
   of $X$ consisting of all functions $f$ such that $f(0) = 0$, $f(1) = 1$, $f \ge 0$, and
   $\int_0^1 f(x)\, dx = 1$. Prove that $M$ is closed and convex and that $\mathrm{dist}(0, M) = 1$.
   Also show that, for every $f \in M$, $\|f\|_\infty > 1$, and hence $M$ contains no element
   of smallest norm.


Definition. A Banach space $X$ is strictly convex if whenever $x \neq y$ and
$\|x\| = \|y\|$, then $\left\| \frac{x + y}{2} \right\| < \|x\|$. Geometrically, strict convexity means that if
$x$ and $y$ are equidistant from the origin, then the midpoint is strictly closer
to the origin than $x$ (and $y$).


7. Prove that a Hilbert space is strictly convex.

8. Let $X$ be a strictly convex Banach space, let $M$ be a closed convex subset, and
   let $x \in X$. Show that if there is a point $y \in M$ such that $\|x - y\| = \mathrm{dist}(x, M)$,
   then $y$ is unique.


Definition. A sequence $(x_n)$ in a Hilbert space $H$ is said to converge weakly
to $x \in H$, written $x_n \xrightarrow{w} x$, if, for every $y \in H$, $\lim_n \langle x_n, y \rangle = \langle x, y \rangle$. In light of the Riesz
representation theorem, this definition is consistent with the corresponding
definition for Banach spaces introduced in the problem set in section 6.4.
Some of the exercises below repeat problems in section 6.4, but
the proofs can be significantly simplified in the context of Hilbert spaces.


9. Prove that if $\{u_n\}$ is an orthonormal basis for a separable Hilbert space, then
   $(u_n)$ converges weakly to 0.

10. Show that a norm convergent sequence is weakly convergent but not
    conversely.

11. Show that the weak limit of a sequence in a Hilbert space, if it exists, is
    unique.

12. Let $(x_n)$ be a bounded sequence in $H$. Show that if $\lim_n \langle x_n, y \rangle = \langle x, y \rangle$ for
    every $y$ in a dense subset of $H$, then $x_n \xrightarrow{w} x$.

13. Let $\{u_n\}$ be an orthonormal basis for a separable Hilbert space $H$, and let $(x_n)$
    be a bounded sequence in $H$. Prove that $x_n \xrightarrow{w} x$ if and only if
    $\lim_n \langle x_n, u_j \rangle = \langle x, u_j \rangle$ for every $j \in \mathbb{N}$.

14. Show that if $x_n \xrightarrow{w} x$, and $\lim_n \|x_n\| = \|x\|$, then $x_n$ is norm convergent to $x$.

15. Prove that if $x_n \xrightarrow{w} x$, then $\{\|x_n\|\}$ is bounded and $\|x\| \le \liminf_n \|x_n\|$. Hint:
    The linear functionals $\lambda_n(\xi) = \langle \xi, x_n \rangle$ are pointwise bounded.

16. Let $x_n \xrightarrow{w} x$. Prove that there exists a subsequence $(x_{n_k})$ of $(x_n)$ such that
    $\frac{1}{N} \sum_{k=1}^{N} x_{n_k}$ is strongly convergent to $x$. Hint: Without loss of generality,
    assume that $x = 0$. Inductively define a subsequence $(x_{n_k})$ of $(x_n)$ such that
    $|\langle x_{n_i}, x_{n_k} \rangle| \le 2^{-k}$ for $i = 1, \dots, k - 1$. Now

    \[ \left\| \frac{1}{N} \sum_{k=1}^{N} x_{n_k} \right\|^2 = \frac{1}{N^2} \sum_{k=1}^{N} \|x_{n_k}\|^2 + \frac{2}{N^2} \sum_{k=2}^{N} \sum_{i=1}^{k-1} \mathrm{Re}\langle x_{n_i}, x_{n_k} \rangle . \]

    This is a version of the Banach-Saks theorem.


7.3 Self-Adjoint Operators 


This section establishes the broad characteristics of self-adjoint operators. We also
study projection operators and prove a theorem (7.3.11) which produces ample
examples of self-adjoint operators. Self-adjointness (more generally, normality) is
not nearly sufficient to produce a result resembling the spectral theorem for
normal operators on finite-dimensional inner product spaces. The complete picture
emerges in the next section.

In the finite-dimensional case, the definition of the adjoint is straightforward,
owing to the simplicity of the characterization of linear functionals on a
finite-dimensional inner product space (see example 9 in section 3.7). The definition
below in the infinite-dimensional case requires the full power of the Riesz
representation theorem. Let $H$ be a separable Hilbert space, and let $T \in \mathcal{L}(H)$. For a
fixed element $y \in H$, define a functional $\lambda_y$ by $\lambda_y(x) = \langle Tx, y \rangle$. It is clear that $\lambda_y$ is
linear. In fact, $\lambda_y$ is bounded since $|\lambda_y(x)| = |\langle Tx, y \rangle| \le \|Tx\| \|y\| \le \|T\| \|x\| \|y\|$. By
the Riesz representation theorem, there exists a unique element $T^*y \in H$ such that,
for all $x \in H$, $\lambda_y(x) = \langle x, T^*y \rangle$. We therefore have a function $T^* : H \to H$ defined
by the requirement that

\[
\langle Tx, y \rangle = \langle x, T^*y \rangle \quad \text{for all } x, y \in H.
\]


The above equation is the defining property of the adjoint operator T* of T. The 
reader can easily see that the definition is consistent with the definition of the 
adjoint operator on a Banach space that was introduced in section 6.6. 




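It is easy to verify that $T^*$ is linear. For example,

\[
\langle x, T^*(y_1 + y_2) \rangle = \langle Tx, y_1 + y_2 \rangle = \langle Tx, y_1 \rangle + \langle Tx, y_2 \rangle
  = \langle x, T^*y_1 \rangle + \langle x, T^*y_2 \rangle = \langle x, T^*y_1 + T^*y_2 \rangle .
\]

This shows that $T^*(y_1 + y_2) = T^*(y_1) + T^*(y_2)$. We now show that $T^* \in \mathcal{L}(H)$:
$\|T^*y\|^2 = \langle T^*y, T^*y \rangle = \langle T(T^*y), y \rangle \le \|T(T^*y)\| \|y\| \le \|T\| \|T^*y\| \|y\|$. Thus
$\|T^*y\| \le \|T\| \|y\|$ for every $y \in H$. Hence $T^* \in \mathcal{L}(H)$, and $\|T^*\| \le \|T\|$.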


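Theorem 7.3.1. For $T, T_1, T_2 \in \mathcal{L}(H)$ and $a \in \mathbb{K}$,

(a) $(T_1 + T_2)^* = T_1^* + T_2^*$.
(b) $(aT)^* = \bar{a} T^*$.
(c) $(T_1 T_2)^* = T_2^* T_1^*$. Consequently, for every $n \in \mathbb{N}$, $(T^*)^n = (T^n)^*$.
(d) $T^{**} = T$.
(e) $\|T^*\| = \|T\|$.
(f) $\|T^*T\| = \|T\|^2$.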


Proof. The computations needed to prove parts (a)-(d) are simple. As an example, we
establish part (c): $\langle T_1 T_2 x, y \rangle = \langle T_2 x, T_1^* y \rangle = \langle x, T_2^* T_1^* y \rangle$, which, by definition, means
that $(T_1 T_2)^* = T_2^* T_1^*$. We already saw that $\|T^*\| \le \|T\|$. Applying the same fact to
$T^*$ and using part (d), we have $\|T\| = \|T^{**}\| \le \|T^*\|$, thus proving (e). To prove
(f), note that $\|T^*T\| \le \|T^*\| \|T\| = \|T\|^2$. Also,

\[
\|Tx\|^2 = \langle Tx, Tx \rangle = \langle x, T^*Tx \rangle \le \|x\| \|T^*Tx\| \le \|x\| \|T^*T\| \|x\| = \|T^*T\| \|x\|^2 ,
\]

which implies that $\|T\|^2 \le \|T^*T\|$, and the proof of (f) is complete.
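The defining property of the adjoint and part (c) of theorem 7.3.1 can be checked by hand in the finite-dimensional case, where the adjoint of a matrix is its conjugate transpose. The following is a minimal sketch in Python on $\mathbb{C}^2$; the sample matrices and vectors are hypothetical and chosen only for illustration, and the inner product is taken to be linear in the first argument, as in the text.

```python
# Finite-dimensional sanity check on C^2 with <x, y> = sum_k x_k * conj(y_k).
# Here the adjoint of a matrix is its conjugate transpose.

def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(T, x):
    """Matrix-vector product T x; T is a list of rows."""
    return [sum(t * a for t, a in zip(row, x)) for row in T]

def adjoint(T):
    """Conjugate transpose of a square matrix."""
    n = len(T)
    return [[T[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical sample data (small integer entries keep the arithmetic exact).
T1 = [[1 + 2j, 3j], [2 + 0j, -1 + 1j]]
T2 = [[0 + 1j, 1 + 0j], [1 - 1j, 2 + 0j]]
x = [1 + 1j, 2 - 1j]
y = [3 + 0j, -1j]

lhs = inner(apply(T1, x), y)            # <T1 x, y>
rhs = inner(x, apply(adjoint(T1), y))   # <x, T1* y>
defining_property_holds = abs(lhs - rhs) < 1e-12

# Part (c): (T1 T2)* = T2* T1*, with the order of the factors reversed.
product_rule_holds = adjoint(matmul(T1, T2)) == matmul(adjoint(T2), adjoint(T1))
```

Such a check on sample data is an illustration of the identities, not a proof of them.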


Definition. An operator $T \in \mathcal{L}(H)$ is called self-adjoint if $T^* = T$. Thus $T$ is
self-adjoint if and only if, for all $x, y \in H$, $\langle Tx, y \rangle = \langle x, Ty \rangle$.


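Example 1. Let $\{u_n : n \in \mathbb{N}\}$ be an orthonormal basis for a separable Hilbert space
$H$, and fix a positive integer $N$. The projection operator $P : H \to H$ defined by
$Px = \sum_{n=1}^{N} \hat{x}_n u_n$, where $\hat{x}_n = \langle x, u_n \rangle$, is self-adjoint. For all vectors $x$ and $y$ in $H$,

\[
\langle Px, y \rangle = \Big\langle \sum_{n=1}^{N} \hat{x}_n u_n , y \Big\rangle = \sum_{n=1}^{N} \hat{x}_n \overline{\hat{y}_n} ,
\]

while

\[
\langle x, Py \rangle = \Big\langle x, \sum_{n=1}^{N} \hat{y}_n u_n \Big\rangle
  = \sum_{n=1}^{N} \overline{\hat{y}_n} \langle x, u_n \rangle = \sum_{n=1}^{N} \hat{x}_n \overline{\hat{y}_n} .
\]

Projection operators are the simplest self-adjoint operators and, in a way, are
building blocks we can use to generate more examples of self-adjoint operators.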


Theorem 7.3.2. (a) The sum of self-adjoint operators is self-adjoint.
(b) If $T$ is self-adjoint and $a \in \mathbb{R}$, then $aT$ is self-adjoint.
(c) The composition of two self-adjoint operators $T_1$ and $T_2$ is self-adjoint if and
only if $T_1 T_2 = T_2 T_1$.
(d) The set of self-adjoint operators is closed in $\mathcal{L}(H)$.


Proof. We leave the proof of (a)-(c) to the reader. To prove (d), let $(T_n)$ be a sequence
of self-adjoint operators that converges in $\mathcal{L}(H)$ to $T$. We show that $T^* = T$:

\[
\|T^* - T\| \le \|T^* - T_n^*\| + \|T_n^* - T_n\| + \|T_n - T\|
  = \|(T - T_n)^*\| + \|T_n - T\| = 2 \|T - T_n\| \to 0 \quad \text{as } n \to \infty .
\]


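Theorem 7.3.3. The eigenvalues of a self-adjoint operator $T$ are real.


Proof. Let $\lambda$ be an eigenvalue of $T$ with eigenvector $u$. Then $\lambda \langle u, u \rangle = \langle \lambda u, u \rangle = \langle Tu, u \rangle =
\langle u, Tu \rangle = \langle u, \lambda u \rangle = \bar{\lambda} \langle u, u \rangle$. Since $\langle u, u \rangle \ne 0$, $\lambda = \bar{\lambda}$, and $\lambda$ is real.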


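Theorem 7.3.4. If $T$ is a self-adjoint operator, then eigenvectors of $T$ corresponding
to distinct eigenvalues are orthogonal.


Proof. Let $Tu_1 = \lambda_1 u_1$, $Tu_2 = \lambda_2 u_2$, where $u_1 \ne 0 \ne u_2$, and $\lambda_1 \ne \lambda_2$. Since $\lambda_2$ is real,
$\lambda_1 \langle u_1, u_2 \rangle = \langle \lambda_1 u_1, u_2 \rangle = \langle Tu_1, u_2 \rangle = \langle u_1, Tu_2 \rangle = \langle u_1, \lambda_2 u_2 \rangle = \lambda_2 \langle u_1, u_2 \rangle$. Thus
$(\lambda_1 - \lambda_2) \langle u_1, u_2 \rangle = 0$, and $\langle u_1, u_2 \rangle = 0$.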


Example 2. The set of eigenvalues of a self-adjoint operator on a separable Hilbert 
space is at most countable. 
Since a separable Hilbert space cannot contain an uncountable subset of 
orthogonal vectors and since eigenvectors corresponding to distinct eigenval- 
ues are orthogonal, the set of eigenvalues is at most countable. 


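Lemma 7.3.5. (a) Let $H$ be a complex Hilbert space, and let $T \in \mathcal{L}(H)$. If $\langle Tx, x \rangle =
0$ for all $x \in H$, then $T = 0$.
(b) Let $H$ be a real Hilbert space, and let $T \in \mathcal{L}(H)$ be self-adjoint. If $\langle Tx, x \rangle = 0$
for all $x \in H$, then $T = 0$.


Proof. It is sufficient to show that $\langle Tx, y \rangle = 0$ for all $x, y \in H$, because, in that
case, $\langle Tx, Tx \rangle = 0$, so $\|Tx\|^2 = 0$, and $T = 0$. For $x, y \in H$, and scalars $\alpha$ and $\beta$,
$0 = \langle T(\alpha x + \beta y), \alpha x + \beta y \rangle = \alpha \bar{\beta} \langle Tx, y \rangle + \bar{\alpha} \beta \langle Ty, x \rangle$. If we take $\alpha = \beta = 1$, then
$\langle Tx, y \rangle + \langle Ty, x \rangle = 0$. If we take $\alpha = i$, $\beta = 1$, then $i \langle Tx, y \rangle - i \langle Ty, x \rangle = 0$. The
above two identities imply that $\langle Tx, y \rangle = 0$.
The proof of part (b) is a straightforward specialization of the proof of (a): taking
$\alpha = \beta = 1$ and using $\langle Ty, x \rangle = \langle y, Tx \rangle = \langle Tx, y \rangle$ gives $2 \langle Tx, y \rangle = 0$.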


Remark. Part (b) of the above lemma is false if $T$ is not self-adjoint. For example,
if $T : \mathbb{R}^2 \to \mathbb{R}^2$ is the $90^\circ$ rotation of the plane, then $\langle Tx, x \rangle = 0$ for all $x \in \mathbb{R}^2$.
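The rotation counterexample in the remark can be verified directly. The sketch below, written in Python with hypothetical sample vectors, shows that the quadratic form $\langle Tx, x \rangle$ vanishes although $T \ne 0$, and that $T$ fails to be self-adjoint.

```python
def rotate90(x):
    """Rotate the vector (a, b) in R^2 counterclockwise by 90 degrees."""
    a, b = x
    return (-b, a)

def dot(x, y):
    """Standard inner product on R^2."""
    return x[0] * y[0] + x[1] * y[1]

# <Tx, x> = -ab + ab = 0 for every x; a few sample vectors illustrate this.
samples = [(1, 0), (0, 1), (3, -2), (-5, 7)]
quadratic_form_values = [dot(rotate90(x), x) for x in samples]

# T is not self-adjoint: <T e1, e2> and <e1, T e2> differ.
e1, e2 = (1, 0), (0, 1)
lhs = dot(rotate90(e1), e2)   # <T e1, e2>
rhs = dot(e1, rotate90(e2))   # <e1, T e2>
```

Here `lhs` is $1$ and `rhs` is $-1$, so the hypothesis of self-adjointness in part (b) cannot be dropped.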


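Theorem 7.3.6. Let $H$ be a complex Hilbert space, and let $T \in \mathcal{L}(H)$. Then $T$ is
self-adjoint if and only if $\langle Tx, x \rangle$ is real for all $x \in H$.


Proof. If $T$ is self-adjoint, then $\langle Tx, x \rangle = \langle x, T^*x \rangle = \langle x, Tx \rangle = \overline{\langle Tx, x \rangle}$. Thus $\langle Tx, x \rangle$ is
real. Conversely, if $\langle Tx, x \rangle$ is real for all $x \in H$, then $\langle Tx, x \rangle = \overline{\langle Tx, x \rangle} = \overline{\langle x, T^*x \rangle} =
\langle T^*x, x \rangle$. Thus $\langle (T^* - T)x, x \rangle = 0$ for all $x$; hence $T^* - T = 0$, by the previous
lemma.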


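Theorem 7.3.7. Let $T \in \mathcal{L}(H)$ be self-adjoint. Then

\[
\|T\| = \sup \{ |\langle Tx, x \rangle| : \|x\| = 1 \} .
\]

Proof. Let $M = \sup \{ |\langle Tx, x \rangle| : \|x\| = 1 \}$. If $\|x\| = 1$, then
$|\langle Tx, x \rangle| \le \|T\| \|x\|^2 = \|T\|$.
Thus $M \le \|T\|$.
It follows from the definition of $M$ that $|\langle Tx, x \rangle| \le M \|x\|^2$ for all $x \in H$. The
following identities are easy to verify:

\[
\langle T(x + y), x + y \rangle - \langle T(x - y), x - y \rangle = 2 \langle Tx, y \rangle + 2 \langle Ty, x \rangle = 2 \langle Tx, y \rangle + 2 \langle y, Tx \rangle
  = 2 \langle Tx, y \rangle + 2 \overline{\langle Tx, y \rangle} = 4 \operatorname{Re} \langle Tx, y \rangle .
\]

Thus

\[
|\operatorname{Re} \langle Tx, y \rangle| \le \tfrac{1}{4} |\langle T(x + y), x + y \rangle| + \tfrac{1}{4} |\langle T(x - y), x - y \rangle|
  \le \frac{M}{4} \big( \|x + y\|^2 + \|x - y\|^2 \big) = \frac{M}{2} \big( \|x\|^2 + \|y\|^2 \big) .
\]

The summary of the above calculations is that

\[
\text{for all } x, y \in H, \quad |\operatorname{Re} \langle Tx, y \rangle| \le \frac{M}{2} \big( \|x\|^2 + \|y\|^2 \big) . \tag{3}
\]

If $\|x\| = 1$ and $Tx \ne 0$, let $y = \frac{Tx}{\|Tx\|}$. Then

\[
\operatorname{Re} \Big\langle Tx, \frac{Tx}{\|Tx\|} \Big\rangle = \frac{1}{\|Tx\|} \langle Tx, Tx \rangle = \|Tx\| .
\]

This and inequality (3) yield

\[
\|Tx\| \le \frac{M}{2} \Big( \|x\|^2 + \Big\| \frac{Tx}{\|Tx\|} \Big\|^2 \Big) = M .
\]

Since the inequality $\|Tx\| \le M$ is trivial when $Tx = 0$, it follows that $\|T\| \le M$.
This completes the proof.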
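Theorem 7.3.8. If $T$ is a self-adjoint operator, then $r(T) = \|T\|$.


Proof. Since $|\lambda| \le \|T\|$ for all $\lambda \in \sigma(T)$, it is sufficient to find an element $\lambda \in \sigma(T)$
such that $|\lambda| = \|T\|$. By the previous theorem, there exists a sequence of unit
vectors $(x_n)$ such that $\lim_n |\langle Tx_n, x_n \rangle| = \|T\|$. Thus there exists a subsequence $(y_n)$
of $(x_n)$ such that $\lim_n \langle Ty_n, y_n \rangle = \|T\|$, or $\lim_n \langle Ty_n, y_n \rangle = -\|T\|$. Therefore there
exists a real number $\lambda$ such that $|\lambda| = \|T\|$ and $\lim_n \langle Ty_n, y_n \rangle = \lambda$. Now

\[
\|Ty_n - \lambda y_n\|^2 = \|Ty_n\|^2 - 2 \lambda \langle Ty_n, y_n \rangle + \lambda^2 \|y_n\|^2 \le \|T\|^2 - 2 \lambda \langle Ty_n, y_n \rangle + \lambda^2
  = 2 \lambda^2 - 2 \lambda \langle Ty_n, y_n \rangle = 2 \lambda \big( \lambda - \langle Ty_n, y_n \rangle \big) \to 0 .
\]

If $T - \lambda I$ is invertible, then $1 = \|y_n\| = \|(T - \lambda I)^{-1} (T - \lambda I) y_n\| \le
\|(T - \lambda I)^{-1}\| \|Ty_n - \lambda y_n\| \to 0$. This contradiction shows that $\lambda \in \sigma(T)$.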


Definition. A bounded operator $P \in \mathcal{L}(H)$ is a projection if, for some closed
subspace $M$ of $H$, $P$ is the (orthogonal) projection of $H$ onto $M$. See the
projection theorem. We remind the reader that we use the notation $P_M$ to denote
the projection of $H$ onto its closed subspace $M$.


Theorem 7.3.9. A bounded operator $P$ is a projection if and only if it is idempotent
and self-adjoint.


Proof. Suppose $P$ is the projection of $H$ onto a closed subspace $M$. The fact that $P^2 = P$
has been established in theorem 7.1.8. We show that $P$ is self-adjoint. First observe
that, for all $x, y \in H$, $Px \in M$ and $Py - y \in M^\perp$; hence $\langle Px, Py - y \rangle = 0$.

Now for $x, y \in H$,

\[
\langle Px, y \rangle = \langle Px, y - Py \rangle + \langle Px, Py \rangle = \langle Px, Py \rangle = \langle Px - x, Py \rangle + \langle x, Py \rangle = \langle x, Py \rangle .
\]

Conversely, suppose that $P$ is self-adjoint and idempotent. Let $M = \{x \in
H : Px = x\}$. Being the kernel of the bounded operator $P - I$, $M$ is a closed subspace
of $H$. We show that $P$ is the projection of $H$ onto $M$ by showing that, for $x \in H$,
$y = Px \in M$, and $z = x - Px \in M^\perp$. Because $Py = P(Px) = P^2 x = Px = y$, $y \in M$.
Now if $w \in M$, then $\langle z, w \rangle = \langle x - Px, w \rangle = \langle x, w \rangle - \langle Px, w \rangle = \langle x, w \rangle - \langle x, P^*w \rangle =
\langle x, w \rangle - \langle x, Pw \rangle = \langle x, w \rangle - \langle x, w \rangle = 0$.
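The characterization in theorem 7.3.9 is easy to observe in $\mathbb{R}^3$, where self-adjoint means symmetric. The following minimal sketch assumes the standard formula $P = vv^{T}/\|v\|^2$ for the orthogonal projection onto the line spanned by a nonzero vector $v$; the vector $v = (1, 2, 2)$ and the matrix `Q` are hypothetical illustrations. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction

def projection_matrix(v):
    """Matrix v v^T / |v|^2 of the orthogonal projection of R^n onto span{v}, v != 0."""
    norm_sq = sum(Fraction(a) * a for a in v)
    return [[Fraction(a) * b / norm_sq for b in v] for a in v]

def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

P = projection_matrix([1, 2, 2])
is_idempotent = matmul(P, P) == P      # P^2 = P
is_self_adjoint = transpose(P) == P    # P* = P (real case: symmetric)

# A symmetric matrix that is not idempotent is not a projection.
Q = [[Fraction(2), Fraction(0), Fraction(0)],
     [Fraction(0), Fraction(0), Fraction(0)],
     [Fraction(0), Fraction(0), Fraction(0)]]
q_is_idempotent = matmul(Q, Q) == Q
```

Both conditions of the theorem hold for `P`, while `Q` is self-adjoint but fails idempotency.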


Definition. Two projections $P_M$ and $P_N$ are said to be orthogonal if $M \perp N$. Notice
that, in this case, $M + N = M \oplus N$.


Theorem 7.3.10. The sum of two orthogonal projections $P_M$ and $P_N$ is a projection
and, in this case, $P_M + P_N = P_{M \oplus N}$. Consequently, the sum of a finite set of
pairwise orthogonal projections is a projection.


Proof. It is easy to verify that $P_M P_N = P_N P_M = 0$. For example, $P_M(P_N x) = 0$
because $P_N x \in N$, and $N \subseteq M^\perp$. Now theorem 7.3.9 implies that $P_M + P_N$ is
a projection since the sum of self-adjoint operators is self-adjoint and $(P_M +
P_N)^2 = P_M^2 + P_M P_N + P_N P_M + P_N^2 = P_M^2 + P_N^2 = P_M + P_N$. We now show
that $R(P_M + P_N) = M \oplus N$. Clearly, $R(P_M + P_N) \subseteq M \oplus N$. Conversely, if
$x = y + z \in M \oplus N$, where $y \in M$, and $z \in N$, then $(P_M + P_N)(x) = P_M(y) +
P_N(y) + P_M(z) + P_N(z) = P_M(y) + P_N(z) = y + z = x$.


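Example 3. Let $M$ and $N$ be the following closed subspaces of $\ell^2$: $M =
\overline{\operatorname{Span}}(\{e_{2n} : n \in \mathbb{N}\})$ and $N = \overline{\operatorname{Span}}(\{e_{2n-1} : n \in \mathbb{N}\})$. Since $M \oplus N = \ell^2$, $P_M +
P_N = I$.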


The following construction produces an abundance of examples of self-adjoint 
operators. This is also the first step toward understanding the structure of compact 
self-adjoint operators. 


Theorem 7.3.11. Let $(P_n)$ be a sequence of pairwise orthogonal projections, and
let $(\lambda_n)$ be a sequence of nonzero complex numbers such that $\lim_n \lambda_n = 0$. Define
$T : H \to H$ by $Tx = \sum_{n=1}^{\infty} \lambda_n P_n x$. Then

(a) $T$ is a bounded operator;
(b) $T^* = \sum_{n=1}^{\infty} \bar{\lambda}_n P_n$; therefore if each $\lambda_n \in \mathbb{R}$, then $T$ is self-adjoint; and
(c) $\{\lambda_n\}$ is the set of nonzero eigenvalues of $T$.


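Proof. We show that the sequence of operators $S_n = \sum_{k=1}^{n} \lambda_k P_k$ is a Cauchy sequence
in $\mathcal{L}(H)$. The previous theorem shows that, for $n < m$, $\sum_{k=n}^{m} P_k$ is a projection;
hence $\| \sum_{k=n}^{m} P_k \| = 1$. (See theorem 7.1.8.) Observe that the mutual orthogonality
of the projections $P_n$ implies that, for every $x \in H$, the vectors $P_n x$ are orthogonal.
Let $\epsilon > 0$. Since $\lambda_n \to 0$, there exists a positive integer $N$ such that, for all
$n > N$, $|\lambda_n| < \epsilon$. Now, for $m > n > N$, and an arbitrary $x \in H$,

\[
\Big\| \sum_{k=n}^{m} \lambda_k P_k x \Big\|^2 = \sum_{k=n}^{m} |\lambda_k|^2 \|P_k x\|^2 \le \epsilon^2 \sum_{k=n}^{m} \|P_k x\|^2
  = \epsilon^2 \Big\| \sum_{k=n}^{m} P_k x \Big\|^2 \le \epsilon^2 \Big\| \sum_{k=n}^{m} P_k \Big\|^2 \|x\|^2 = \epsilon^2 \|x\|^2 .
\]

This shows that the sequence $S_n$ of partial sums of the series defining $T$ is
Cauchy; hence the series converges, and $T \in \mathcal{L}(H)$. This proves part (a).

Now $\langle Tx, y \rangle = \langle \sum_{n=1}^{\infty} \lambda_n P_n x, y \rangle = \sum_{n=1}^{\infty} \lambda_n \langle P_n x, y \rangle = \sum_{n=1}^{\infty} \lambda_n \langle x, P_n y \rangle = \langle x, \sum_{n=1}^{\infty}
\bar{\lambda}_n P_n y \rangle$; hence we obtain the stated formula for $T^*$.

Fix a positive integer $m$. For a nonzero element $x \in M_m = R(P_m)$, $P_m x = x$, and
$P_n x = 0$ for all $n \ne m$. Thus $Tx = \sum_{n=1}^{\infty} \lambda_n P_n x = \lambda_m P_m x = \lambda_m x$. Hence $\lambda_m$ is an
eigenvalue of $T$. To show that $\{\lambda_n\}$ is the entire set of nonzero eigenvalues of
$T$, let $u \in H$ and $0 \ne \mu \in \mathbb{C}$ be such that $\mu \ne \lambda_n$ for every $n$, and $Tu = \mu u$. Since $u = T(u/\mu)$,
$u \in R(T)$. We show below that $u \in R(T)^\perp$; hence $u = 0$, which will establish (c).
For $x \in M_n$, $Tx = \lambda_n x$, $T^*x = \bar{\lambda}_n x$, so

\[
\mu \langle u, Tx \rangle = \langle \mu u, Tx \rangle = \langle Tu, \lambda_n x \rangle = \langle u, T^*(\lambda_n x) \rangle
  = \langle u, \bar{\lambda}_n \lambda_n x \rangle = \lambda_n \langle u, \lambda_n x \rangle = \lambda_n \langle u, Tx \rangle .
\]

Thus $(\mu - \lambda_n) \langle u, Tx \rangle = 0$. Since $\mu \ne \lambda_n$, $\langle u, Tx \rangle = 0$ and, because $Tx = \lambda_n x$ with
$\lambda_n \ne 0$, $\langle u, x \rangle = 0$. We have shown that $u \in M_n^\perp$
for every $n \in \mathbb{N}$. Therefore $u \perp S = \operatorname{Span}\{\cup_{n \in \mathbb{N}} M_n\}$; hence $u \perp \overline{S}$. Clearly, $R(T) \subseteq
\overline{S}$; hence $u \in R(T)^\perp$.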


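Example 4. Consider the operator $T : \ell^2 \to \ell^2$ defined by $Tx = \sum_{n=1}^{\infty} \frac{1}{n} P_n x$. Here
$P_n$ is the projection of $\ell^2$ on the one-dimensional subspace spanned by $e_n$.
By the above theorem, $T$ is self-adjoint, and the set $\{\lambda_n = \frac{1}{n} : n \in \mathbb{N}\}$ is the
entire set of nonzero eigenvalues. Since the spectrum of $T$ is closed, $\lambda = 0$ is
in $\sigma(T)$. However, since $T$ is injective, $\lambda = 0$ is not an eigenvalue of $T$. We
now show that the set $S = \{\frac{1}{n} : n \in \mathbb{N}\} \cup \{0\}$ is the entire spectrum of $T$. If
$\lambda \in \mathbb{C} - S$, then $\delta = \operatorname{dist}(\lambda, S) > 0$. Now $(T - \lambda I)x = \sum_{n=1}^{\infty} (\frac{1}{n} - \lambda) x_n e_n$; hence
$\|(T - \lambda I)x\|^2 = \sum_{n=1}^{\infty} |\frac{1}{n} - \lambda|^2 |x_n|^2 \ge \delta^2 \sum_{n=1}^{\infty} |x_n|^2 = \delta^2 \|x\|^2$. Thus $T - \lambda I$ is
bounded away from zero. In the same manner, the adjoint of $T - \lambda I$, namely,
$T - \bar{\lambda} I$, is bounded away from zero. Hence $T - \lambda I$ is invertible by problem 11 at the
end of this section.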
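As a worked illustration of the convergence argument in the proof of theorem 7.3.11, the rate at which the partial sums approach the operator of example 4 can be computed exactly. The notation $S_n = \sum_{k=1}^{n} \frac{1}{k} P_k$ follows that proof; the computation below is a sketch for this particular operator only.

```latex
% For x = (x_k) in \ell^2, (T - S_n)x = \sum_{k>n} (x_k/k) e_k, so
\[
  \|(T - S_n)x\|^2 = \sum_{k=n+1}^{\infty} \frac{|x_k|^2}{k^2}
  \le \frac{1}{(n+1)^2} \sum_{k=n+1}^{\infty} |x_k|^2
  \le \frac{\|x\|^2}{(n+1)^2},
\]
% with equality when x = e_{n+1}. Hence
\[
  \|T - S_n\| = \frac{1}{n+1} \longrightarrow 0 \quad \text{as } n \to \infty .
\]
```

In particular, $T$ is a limit in $\mathcal{L}(H)$ of the finite-rank operators $S_n$.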


Theorem 7.3.12. Let $T \in \mathcal{L}(H)$. Then


(a) $R(T)^\perp = N(T^*)$, and
(b) $N(T^*)^\perp = \overline{R(T)}$.

Proof. $y \in R(T)^\perp$ if and only if $\langle y, Tx \rangle = 0$ for all $x \in H$ if and only if $\langle T^*y, x \rangle = 0$
for all $x \in H$ if and only if $T^*y = 0$ if and only if $y \in N(T^*)$. Part (b) follows from
(a) and $N(T^*)^\perp = R(T)^{\perp\perp} = \overline{R(T)}$.


The above theorem has many applications. Here is a simple example. 


Example 5. Suppose $T$ is a self-adjoint operator. If $\lambda$ is not an eigenvalue of $T$, then
the range of $T - \lambda I$ is dense.
If $\lambda$ is not an eigenvalue of $T$, $\bar{\lambda}$ is not an eigenvalue of $T$. Applying the previous
theorem, we have $\overline{R(T - \lambda I)} = N(T^* - \bar{\lambda} I)^\perp = N(T - \bar{\lambda} I)^\perp = \{0\}^\perp = H$.


The following example shows that the entire spectrum, not just the eigenvalues, 
of a self-adjoint operator is contained in R. Problem 16 at the end of this section 
provides a sharper result. 


Example 6. Let $T$ be a self-adjoint operator on $H$. Then $\sigma(T) \subseteq \mathbb{R}$.

It is enough to show that if $\lambda \in \mathbb{C}$ and $\mu = \operatorname{Im}(\lambda) \ne 0$, then $\lambda \in \rho(T)$. Let
$x \in H$. Using theorem 7.3.6, we have $\mu \|x\|^2 = -\operatorname{Im} \langle (T - \lambda I)x, x \rangle$. Thus
$|\mu| \|x\|^2 \le |\langle (T - \lambda I)x, x \rangle| \le \|(T - \lambda I)x\| \|x\|$. Hence $|\mu| \|x\| \le \|(T - \lambda I)x\|$. This
proves that $T - \lambda I$ is bounded away from zero. In particular, by example 8 in
section 6.2, $T - \lambda I$ is injective, and its range is closed. By the previous example,
$R(T - \lambda I) = H$. This shows that $T - \lambda I$ is invertible.


The next example illustrates that theorem 7.3.11 is not the only way to construct 
self-adjoint operators and that we have wide control over the design of the 
spectrum. 


Example 7. Let $\{q_n : n \in \mathbb{N}\}$ be an enumeration of the rational numbers in $[0, 1]$,
and let $\{u_n : n \in \mathbb{N}\}$ be an orthonormal basis for a Hilbert space $H$. Define an
operator $T$ as follows: $T(x) = \sum_{n=1}^{\infty} q_n \hat{x}_n u_n$, where $\hat{x}_n = \langle x, u_n \rangle$. We show that $\sigma(T) = [0, 1]$.

The assumptions of theorem 7.3.11 are clearly not satisfied. However, $T$
is bounded because $\|Tx\|^2 = \sum_{n=1}^{\infty} q_n^2 |\hat{x}_n|^2 \le \sum_{n=1}^{\infty} |\hat{x}_n|^2 = \|x\|^2$. Thus $\|T\| \le 1$.
The verification that $T$ is self-adjoint is similar to example 1, and we leave it to
the reader. Since $Tu_n = q_n u_n$, each $q_n$ is an eigenvalue of $T$. Since $\sigma(T)$ is closed,
$[0, 1] \subseteq \sigma(T)$. By the above example and corollary 6.5.3, $\sigma(T) \subseteq [-1, 1]$. It is
easy to see that if $\lambda \in [-1, 0)$, then $|q_n - \lambda| \ge |\lambda|$ for every $n \in \mathbb{N}$. A calculation
similar to that in example 4 shows that $\|(T - \lambda I)x\| \ge |\lambda| \|x\|$, and hence $T - \lambda I$ is
bounded away from zero. Since $T - \lambda I$ is self-adjoint, it is invertible by problem
11 at the end of this section.

Normal and Unitary Operators


We briefly discuss two classes of bounded operators. The section exercises extend 
the discussion. The definition of a normal operator is the same as that in the finite- 
dimensional case discussed in section 3.7. 


Definition. A bounded operator $T$ on a Hilbert space is normal if $TT^* = T^*T$.


Observe that every self-adjoint operator is normal and that, for an arbitrary
operator $T$, $TT^*$ and $T^*T$ are self-adjoint.


Example 8. Let $T$ be a normal operator. Then, for $n \in \mathbb{N}$, $\|T^n\| = \|T\|^n$.
We apply the result of problem 8 in the section exercises to the self-adjoint
operator $TT^*$:

\[
\|T^n\|^2 = \|T^n (T^n)^*\| = \|T^n (T^*)^n\| = \|(TT^*)^n\| = \|TT^*\|^n = \|T\|^{2n} .
\]

Now equate the square roots of the extreme quantities of the last string.


Example 9. A bounded operator $T$ is normal if and only if, for every $x \in H$, $\|Tx\| =
\|T^*x\|$. Consequently, $N(T) = N(T^*)$.
If $T$ is normal, then $\|Tx\|^2 - \|T^*x\|^2 = \langle Tx, Tx \rangle - \langle T^*x, T^*x \rangle = \langle x, T^*Tx \rangle -
\langle x, TT^*x \rangle = \langle x, (T^*T - TT^*)x \rangle = \langle x, 0 \rangle = 0$.
If $\|Tx\| = \|T^*x\|$, then, by the above calculation, $\langle x, (T^*T - TT^*)x \rangle = 0$ for all
$x \in H$. Since $T^*T - TT^*$ is self-adjoint, it is $0$, by lemma 7.3.5.
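The criterion of example 9 can be tried on small real matrices, for which the adjoint is simply the transpose. The sketch below uses two hypothetical $2 \times 2$ matrices: a rotation-scaling matrix, which is normal but not self-adjoint, and a shear, which is not normal.

```python
def apply(T, x):
    """Matrix-vector product T x; T is a list of rows."""
    return [sum(t * a for t, a in zip(row, x)) for row in T]

def transpose(T):
    """Adjoint of a real matrix."""
    return [list(col) for col in zip(*T)]

def norm_sq(x):
    """Squared Euclidean norm of a real vector."""
    return sum(a * a for a in x)

normal_T = [[1, -1], [1, 1]]   # T T* = T* T = 2I, but T* != T
shear = [[1, 1], [0, 1]]       # T T* != T* T

samples = [[1, 0], [0, 1], [2, -3], [5, 4]]

# For the normal matrix, |Tx|^2 = |T*x|^2 on every sample vector.
normal_ok = all(
    norm_sq(apply(normal_T, x)) == norm_sq(apply(transpose(normal_T), x))
    for x in samples
)

# For the shear, the two norms already differ at x = e1.
shear_gap = (norm_sq(apply(shear, [1, 0])),
             norm_sq(apply(transpose(shear), [1, 0])))
```

For the shear, `shear_gap` records $\|Te_1\|^2 = 1$ against $\|T^*e_1\|^2 = 2$, so the equality of norms fails, as the example predicts for a non-normal operator.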


Example 10. Let $\mu$ be an arbitrary complex number, and define $T(x) = \mu x$.
It is clear that $T^*x = \bar{\mu} x$. By the previous example, $T$ is normal because
$\|Tx\| = \|\mu x\| = |\mu| \|x\| = |\bar{\mu}| \|x\| = \|T^*x\|$.


Definition. A bounded operator $U$ is unitary if $UU^* = U^*U = I$.


Observe that $U$ is unitary if and only if $U^{-1} = U^*$ and that every unitary operator
is normal.


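Example 11. $U$ is unitary if and only if $\|Ux\| = \|x\|$ for every $x \in H$.
If $U$ is unitary, then $\|Ux\|^2 = \langle Ux, Ux \rangle = \langle x, U^*Ux \rangle = \langle x, x \rangle = \|x\|^2$. Conversely,
if $\langle Ux, Ux \rangle = \langle x, x \rangle$ for all $x$, then $\langle x, U^*Ux \rangle - \langle x, x \rangle = 0$ for all $x \in H$; hence
$U^*U = I$ by lemma 7.3.5. (In an infinite-dimensional space, the identity $UU^* = I$
requires, in addition, that $U$ be surjective; the right shift on $\ell^2$ is an isometry
that is not unitary.)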


Example 12. For $\theta \in [0, 2\pi)$, the operator $U(x) = e^{i\theta} x$ is unitary, by the previous
example.

The results of the last two examples are consistent with the fact that unitary operators
resemble rotations of the plane. Also see the last three problems in the section
exercises.


Exercises


1. Let $T$ be a linear operator on $H$ such that, for all $x, y \in H$,

\[
\langle Tx, y \rangle = \langle x, Ty \rangle .
\]

Prove that $T$ is bounded. Hint: Use the closed graph theorem.

2. Let $T \in \mathcal{L}(H)$, and let $S$ be an invertible operator. Show that $T$ and $S^{-1}TS$
have the same eigenvalues.

3. Show that if $T$ is invertible, then $(T^{-1})^* = (T^*)^{-1}$.

4. Prove that $\lambda \in \sigma(T)$ if and only if $\bar{\lambda} \in \sigma(T^*)$.

5. Complete the proof of theorem 7.3.1.

6. Complete the proof of theorem 7.3.2.

7. Let $T$ be a linear operator on a complex Hilbert space $H$. Prove that, for all
$x, y \in H$,

\[
\langle T(x + y), x + y \rangle - \langle T(x - y), x - y \rangle + i \langle T(x + iy), x + iy \rangle
  - i \langle T(x - iy), x - iy \rangle = 4 \langle Tx, y \rangle .
\]

For a real Hilbert space, prove that if $T$ is self-adjoint, then
$\langle T(x + y), x + y \rangle - \langle T(x - y), x - y \rangle = 4 \langle Tx, y \rangle$.

Use the above identities to provide another proof of lemma 7.3.5.

8. Let $T$ be a self-adjoint operator on a separable Hilbert space $H$.

(a) Prove that $\|T\|^2 = \|T^2\|$. By induction, $\|T\|^{2^k} = \|T^{2^k}\|$ for every positive
integer $k$.

(b) Prove that, for every positive integer $n$, $\|T^n\| = \|T\|^n$. Hint: Choose an
integer $k$ such that $1 \le n \le 2^k$; $\|T\|^{2^k} = \|T^{2^k}\| = \|T^n T^{2^k - n}\|$.

9. Prove that if $x_n \xrightarrow{w} x$, and $T \in \mathcal{L}(H)$, then $Tx_n \xrightarrow{w} Tx$.

10. Let $T \in \mathcal{L}(H)$. Prove there are unique self-adjoint operators $A$ and $B$ such
that $T = A + iB$ and $T^* = A - iB$.

11. Prove that a bounded operator $T$ on a Hilbert space is invertible if and only
if both $T$ and $T^*$ are bounded away from zero. Consequently, if $T$ is
self-adjoint, the mere assumption that $T$ is bounded away from zero implies the
invertibility of $T$.

12. Let $\{u_n\}$ be an orthonormal basis for $H$, and let $\{\lambda_n\} \subseteq \mathbb{C}$ be such
that $\lim_n \lambda_n = \lambda$ and $\lambda_n \ne \lambda$ for all $n$. Prove that the function $Tx =
\sum_{n=1}^{\infty} \lambda_n \langle x, u_n \rangle u_n$ is a bounded linear operator on $H$. Prove also that each
$\lambda_n$ is an eigenvalue of $T$, but $\lambda$ is not.

13. Let $T \in \mathcal{L}(H)$. Show that if a subspace $M$ is invariant under $T$, then $M^\perp$ is
invariant under $T^*$.

14. Let $R$ and $L$ be the right and left shift operators on $\ell^2$, respectively.

(a) Show that $R^* = L$.

(b) Describe the eigenvalues of each operator.

(c) Prove that $\sigma(R) = \sigma(L) =$ the closed unit disk in the complex plane.

15. Let $T$ be a self-adjoint operator on $H$, and let $\lambda$ be a complex number. Prove
that $\lambda \in \sigma(T)$ if and only if $\inf_{\|x\| = 1} \|(T - \lambda I)(x)\| = 0$. Hint: If there exists
a constant $\delta > 0$ such that $\|(T - \lambda I)(x)\| \ge \delta \|x\|$ for every $x \in H$, then, by
example 8 in section 6.2, $R(T - \lambda I)$ is closed. Show that it is also dense
in $H$. To prove the converse, examine the proof of theorem 7.3.8. Observe
that this result is false if $T$ is not self-adjoint. The right shift on $\ell^2$ satisfies
$\|Rx\| = \|x\|$ but $0 \in \sigma(R)$.

16. Let $T$ be a self-adjoint operator on a separable Hilbert space $H$, let
$m = \inf_{\|x\| = 1} \langle Tx, x \rangle$, and let $M = \sup_{\|x\| = 1} \langle Tx, x \rangle$. Prove that $\sigma(T) \subseteq [m, M]$
and that both $m$ and $M$ are in $\sigma(T)$. Hint: Since $\sigma(T + \mu I) = \sigma(T) + \mu$, we
may assume (by considering $T + \mu I$ for a sufficiently large positive constant
$\mu$) that $0 < m \le M$. By theorem 7.3.7, $\|T\| = M$. Thus $\sigma(T) \subseteq [-M, M]$. Let
$\delta > 0$, and let $\lambda = m - \delta$. For every unit vector $u$, $\|Tu - \lambda u\| \ge \langle Tu - \lambda u, u \rangle$.
Show that $\langle Tu - \lambda u, u \rangle \ge \delta$. By the previous problem, $m - \delta \notin \sigma(T)$. This
proves that $\sigma(T) \subseteq [m, M]$. To show that $M \in \sigma(T)$, use theorem 7.3.7 to
find a sequence of unit vectors $(u_n)$ such that $\lim_n \langle Tu_n, u_n \rangle = M$. Show that
$\lim_n \|Tu_n - Mu_n\| = 0$, and again use problem 15. To show that $m \in \sigma(T)$,
assume (by considering $T - \mu I$ for a sufficiently large positive constant $\mu$)
that $m \le M < 0$. Apply the result you just obtained to the operator $S = -T$
to conclude that $-m \in \sigma(S)$.

17. Let $T$ be a self-adjoint operator on $H$, let $M$ be a closed, $T$-invariant
subspace of $H$, and let $N = M^\perp$. If $T_1$ and $T_2$ are the restrictions of $T$ to
$M$ and $N$, respectively, prove that $R(T) = R(T_1) \oplus R(T_2)$ and that $\sigma(T) =
\sigma(T_1) \cup \sigma(T_2)$. Hint: Use problem 15 to show that if $\lambda \notin \sigma(T_1) \cup \sigma(T_2)$, then
$\lambda \notin \sigma(T)$.

18. Prove that if $P$ is the projection on a closed subspace $M$, then $I - P$ is the
projection of $H$ on $M^\perp$.

19. Let $P$ be a projection. Prove that $0$ and $1$ are the only eigenvalues of $P$. What
is $\sigma(P)$?

20. Let $P$ be a projection. Show that, for all $x \in H$, $\langle Px, x \rangle = \|Px\|^2$.


21. Show that the composition $P_M P_N$ of two projections is a projection if and
only if $P_M$ and $P_N$ commute. In this case, $P_M P_N = P_{M \cap N}$.


Definition. A bounded operator $T \in \mathcal{L}(H)$ is positive if $\langle Tx, x \rangle \ge 0$ for
every $x \in H$. If $\langle Tx, x \rangle > 0$ for all nonzero vectors $x \in H$, $T$ is said to be
strictly positive.


22. Prove that the eigenvalues of a positive operator are nonnegative. If $T$ is
strictly positive, prove that the eigenvalues of $T$ are positive.

23. Prove that a bounded operator $T$ on a Hilbert space is invertible if there
exists a positive constant $c$ such that $T - cI$ is a positive operator.

24. Show that, for $T \in \mathcal{L}(H)$, $T^*T$ is a positive operator.

25. (a) Prove that if each $\lambda_n > 0$, then the operator $T$ defined in theorem 7.3.11
is positive.

(b) For $\alpha > 0$, define $T^\alpha(x) = \sum_{n=1}^{\infty} \lambda_n^\alpha P_n(x)$. Prove that $T^\alpha T^\beta = T^{\alpha + \beta}$.

26. Let $S$ and $T$ be commuting self-adjoint operators. Prove that the operator
$S + \alpha T$ is normal for every $\alpha \in \mathbb{C}$.

27. Let $T$ be a normal operator. Prove that $r(T) = \|T\|$. Conclude that if
$T \ne 0$, then $\sigma(T)$ contains at least one nonzero point. Hint: Use example
8 and theorem 6.5.7. Observe that this result generalizes theorem 7.3.8 and
provides an alternative proof of it.

28. Let $T$ be a normal operator. Prove that if $\lambda$ is an eigenvalue of $T$ and $u$
is a corresponding eigenvector, then $\bar{\lambda}$ is an eigenvalue of $T^*$ and $u$ is a
corresponding eigenvector.

29. Prove that eigenvectors of a normal operator corresponding to distinct
eigenvalues are orthogonal.

30. Prove that a bounded operator $U$ is unitary if and only if $\langle Ux, Uy \rangle = \langle x, y \rangle$
for all $x, y \in H$.

31. If $U$ is a unitary operator and $\{u_n\}$ is an orthonormal basis for $H$, prove that
$\{Uu_n\}$ is an orthonormal basis for $H$.

32. Prove that if $\lambda$ is an eigenvalue of a unitary operator, then $|\lambda| = 1$.


7.4 Compact Operators 


In section 3.7, we established the spectral theorem for normal operators on
finite-dimensional inner product spaces. The question now is how much of the
finite-dimensional theory can be generalized to self-adjoint (more generally, normal)
operators on a separable Hilbert space.


Self-adjoint operators on an infinite-dimensional separable Hilbert space share
some of the properties of Hermitian matrices. For example, the eigenvalues of
such an operator are real, and eigenvectors corresponding to distinct eigenvalues
are orthogonal. However, the spectral theorem does not extend to self-adjoint
operators for a simple reason: A self-adjoint operator on an infinite-dimensional
separable Hilbert space may not have any eigenvalues. The following example
assumes familiarity with the space $L^2(0, 1)$ of (Lebesgue) square integrable
functions on $[0, 1]$, with the inner product $\langle f, g \rangle = \int_0^1 f(x) \overline{g(x)} \, dx$. The unfamiliar reader
can think of $L^2(0, 1)$ as the completion of $C[0, 1]$ with respect to the given inner
product. See theorem 7.1.10 and example 4 in section 7.2.


Example 1. The operator $T$ on $L^2(0, 1)$ defined by $(Tu)(x) = x u(x)$ is clearly
self-adjoint and has no eigenvalues.


In this section, we study compact operators in some depth. The culmination of the 
section is the spectral theorem for compact self-adjoint operators. 


Definition. A linear operator $T$ on a separable Hilbert space $H$ is compact if it
maps bounded sets into relatively compact sets. Thus $T$ is compact if whenever
$A$ is a bounded subset of $H$, then $\overline{T(A)}$ is compact.


Example 2. (a) Compact operators are clearly bounded. 

(b) The identity operator, I, on an infinite-dimensional Hilbert space is never 
compact. The image of the unit ball, which is bounded, is itself. But, 
in infinite-dimensional space, no ball is relatively compact, so I is not 
compact. 

(c) Define $T : \ell^2 \to \ell^2$ as follows: for $x = (x_n) \in \ell^2$, $T(x) = (x_1, 0, x_3, 0, x_5, \dots)$.
The set $A = \{e_{2n-1} : n \in \mathbb{N}\}$ is bounded, but its image $T(A) = A$ is not
relatively compact. Hence $T$ is not a compact operator.


Example 3. A bounded operator $T$ on a Hilbert space $H$ is compact if and only if
$\overline{T(B)}$ is compact, where $B$ is the unit ball in $H$.

Suppose $\overline{T(B)}$ is compact, and let $A$ be a bounded subset of $H$. Then $A$ is
contained in a ball $B_r$ of radius $r$ and centered at the origin. Because $T(B_r) =
rT(B)$, $\overline{T(B_r)}$ is compact; hence $\overline{T(A)}$ is compact. The converse is trivial.


Example 4. Define $D : \ell^2 \to \ell^2$ as follows: for an element $x = (x_n) \in \ell^2$, $D(x) =
(x_1, x_2/2, x_3/3, \dots)$. Being a closed subset of the Hilbert cube, $\overline{D(B)}$ is compact.
Thus $D$ is compact, by the previous example.


Theorem 7.4.1. An operator $T \in \mathcal{L}(H)$ is compact if and only if, for every bounded
sequence $(x_n)$ in $H$, $(T(x_n))$ contains a convergent subsequence.

Proof. Suppose $T$ is compact, and let $(x_n)$ be a bounded sequence in $H$, say, $\|x_n\| \le r$.
By assumption, $\overline{T(B(0, r))}$ is compact and contains $T(x_n)$. By the sequential
compactness of $\overline{T(B(0, r))}$, $(T(x_n))$ contains a convergent subsequence.


Conversely, if $T$ is not compact, then there exists a bounded subset $A$ such that
$\overline{T(A)}$ is not compact. In particular, $T(A)$ is not totally bounded. Thus there exists
a positive number $\epsilon$ and a sequence $(x_n)$ in $A$ such that $\|T(x_n) - T(x_m)\| \ge \epsilon$ for all
$m \ne n$. See the proof of theorem 4.7.6. We have constructed a bounded sequence
$(x_n)$ for which $(T(x_n))$ contains no convergent subsequence.


Example 5. A compact operator $T$ maps weakly convergent sequences into
(norm) convergent sequences. The converse is also true. See problem 2 at
the end of this section.

Let $(x_n)$ be a sequence in $H$ such that $x_n \xrightarrow{w} x$. We show that every
subsequence of $(T(x_n))$ contains a subsequence that converges to $T(x)$. Pick a
subsequence $y_k = x_{n_k}$ of $(x_n)$. Since $y_k \xrightarrow{w} x$ and since weakly convergent sequences are bounded
(problem 15 on section 7.2), the previous theorem yields a subsequence $z_p =
y_{k_p}$ of $(y_k)$ such that $T(z_p)$ is convergent. Set $\lim_p T(z_p) = z$. In particular,
$T(z_p) \xrightarrow{w} z$. Now $z_p \xrightarrow{w} x$, so, by problem 6 on section 6.6, $T(z_p) \xrightarrow{w} T(x)$. By
the uniqueness of weak limits (problem 11 on section 7.2), $z = T(x)$.


Theorem 7.4.2. The set $\mathcal{K}$ of compact operators on a separable Hilbert space $H$ is a
closed subspace of $\mathcal{L}(H)$.


Proof. We leave it to the reader to verify that $\mathcal{K}$ is a vector space. To prove that $\mathcal{K}$
is closed, let $T \in \mathcal{L}(H)$ be in the closure of $\mathcal{K}$. Let $(x_n)$ be a bounded sequence
in $H$, and suppose that $\|x_n\| \le r$. If $\epsilon > 0$, there exists a compact operator $K$
such that $\|T - K\| < \epsilon$. Since $K$ is compact, a subsequence $(y_n)$ of $(x_n)$ exists
such that $K(y_n)$ is convergent. In particular, $K(y_n)$ is a Cauchy sequence, so
there exists a positive integer $N$ such that, for $m, n > N$, $\|Ky_n - Ky_m\| < \epsilon$. Now,
for $m, n > N$, $\|Ty_n - Ty_m\| \le \|Ty_n - Ky_n\| + \|Ky_n - Ky_m\| + \|Ky_m - Ty_m\| \le
\|T - K\| \|y_n\| + \epsilon + \|K - T\| \|y_m\| \le r\epsilon + \epsilon + r\epsilon$. Thus $(Ty_n)$ is Cauchy; hence it is
convergent.


Theorem 7.4.3. (a) If $T$ is compact and $S \in \mathcal{L}(H)$, then $ST$ and $TS$ are compact.
(b) If $T$ is compact, and $H$ is infinite dimensional, then $0 \in \sigma(T)$.


Proof. The proof of part (a) is a straightforward application of theorem 7.4.1 and the
fact that a bounded operator maps bounded sequences into bounded sequences
and convergent sequences into convergent sequences. To prove (b), suppose
$0 \notin \sigma(T)$. Then $T$ is invertible, so there exists a bounded operator $S$ such that
$ST = I$. By part (a), $ST$ would be compact, so $I$ would be compact, which is false
by example 2.




Definition. An operator $T \in \mathcal{L}(H)$ is said to be of finite rank if $R(T)$ is finite
dimensional.


Theorem 7.4.4.
(a) A bounded, finite-rank operator $T$ is compact.
(b) If $T$ is compact and $R(T)$ is closed, then $T$ is of finite rank.


Proof. (a) Suppose $\dim(R(T)) < \infty$. The continuity of $T$ implies that $T(A)$ is a
bounded subset of $R(T)$ for every bounded subset $A$ of $H$. But bounded subsets
of a finite-dimensional space are relatively compact by the Heine-Borel theorem.
This proves that $T$ is compact.


(b) If $R(T)$ is closed, then it is a Banach space, and $T$ maps $H$ onto $R(T)$. The open
mapping theorem implies that $T$ is an open mapping. Coupled with the compactness
of $T$, this implies that the image $T(B)$ of the unit ball, $B$, in $H$ is relatively compact
and contains a ball $B' = \{x \in R(T) : \|x\| < \delta\}$. In particular, the closed ball
$\overline{B'}$ in $R(T)$ is compact. This cannot happen unless $R(T)$ is finite dimensional.


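Example 6. Let $(a_{ij})$ be an infinite matrix such that $\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}|^2 < \infty$, and
define an operator $T$ on $\ell^2$ as follows: for $x = (x_n) \in \ell^2$, the $i$th term of $T(x)$ is $\sum_{j=1}^{\infty} a_{ij} x_j$. We
claim that $T$ is compact. Observe that the assumptions imply that $|\sum_{j=1}^{\infty} a_{ij} x_j|^2 \le
(\sum_{j=1}^{\infty} |x_j|^2)(\sum_{j=1}^{\infty} |a_{ij}|^2) = \|x\|^2 \sum_{j=1}^{\infty} |a_{ij}|^2$. Also, $\lim_{n \to \infty} \sum_{i=n+1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}|^2 = 0$.

For $n \in \mathbb{N}$ let $P_n$ be the projection of $\ell^2$ onto the finite-dimensional subspace
$\operatorname{Span}(\{e_1, \dots, e_n\})$. Thus, for $x = (x_n) \in \ell^2$, $P_n(x) = (x_1, \dots, x_n, 0, 0, 0, \dots)$. Since
$P_n$ is compact, $P_n T$ is compact by theorem 7.4.3. If we show that $\lim_n \|T -
P_n T\| = 0$, the proof will be complete by theorem 7.4.2. Now $\|(T - P_n T)x\|^2 =
\sum_{i=n+1}^{\infty} |\sum_{j=1}^{\infty} a_{ij} x_j|^2 \le \|x\|^2 \sum_{i=n+1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}|^2$. This shows that $\|T - P_n T\|^2 \le
\sum_{i=n+1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}|^2$. Since $\lim_n \sum_{i=n+1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}|^2 = 0$, we are done.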
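To make the tail estimate of example 6 concrete, here is a worked instance for one hypothetical matrix, $a_{ij} = 2^{-(i+j)}$, which is not taken from the text. The double series is a product of two geometric series, so the bound on $\|T - P_n T\|$ can be evaluated in closed form.

```latex
% Hypothetical matrix a_{ij} = 2^{-(i+j)}, so |a_{ij}|^2 = 4^{-i} 4^{-j}.
\[
  \sum_{i=n+1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}|^2
  = \Big( \sum_{i=n+1}^{\infty} 4^{-i} \Big) \Big( \sum_{j=1}^{\infty} 4^{-j} \Big)
  = \frac{4^{-n}}{3} \cdot \frac{1}{3} = \frac{4^{-n}}{9},
\]
% and therefore
\[
  \|T - P_n T\| \le \frac{2^{-n}}{3} \longrightarrow 0 \quad \text{as } n \to \infty .
\]
```

For this matrix the finite-rank operators $P_n T$ thus converge to $T$ at a geometric rate.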


Not every compact operator is of finite rank. The following theorem provides the 
next best result. 


Theorem 7.4.5. Every compact operator $T$ on a separable Hilbert space $H$ is the limit
of a sequence of finite-rank operators.


Proof. Let $B$ be the closed unit ball in $H$. Since $T(B)$ is relatively compact, for every
$n \in \mathbb{N}$, there exists a finite subset $F_n$ of $H$ such that $T(B) \subseteq \cup_{y \in F_n} B(y, 1/n)$. Let
$M_n = \operatorname{Span}\{F_n\}$, and let $P_n$ be the projection of $H$ onto $M_n$. Finally, let $T_n = P_n T$.
Note that $R(T_n)$ has finite dimension because it is contained in $M_n$. Thus each
$T_n$ is a finite-rank operator. We now show that, for $x \in B$, $\|T_n x - Tx\| < 2/n$.
This will prove that $\lim_n T_n = T$. Fix $n \in \mathbb{N}$ and write $F_n = \{y_1, \dots, y_N\}$. If $x \in B$,
$Tx \in T(B) \subseteq \cup_{i=1}^{N} B(y_i, 1/n)$. Thus, for some $1 \le i \le N$, $\|Tx - y_i\| < 1/n$. Now
$\|T_n x - y_i\| = \|P_n(Tx) - P_n(y_i)\| = \|P_n(Tx - y_i)\| \le \|P_n\| \|Tx - y_i\| = \|Tx - y_i\| <
1/n$. Finally, $\|T_n x - Tx\| \le \|T_n x - y_i\| + \|y_i - Tx\| < 1/n + 1/n = 2/n$.


Theorem 7.4.6. Let $T \in \mathcal{L}(H)$. Then $T$ is compact if and only if $T^*$ is compact.


Proof. Since $T^{**} = T$, it is enough to show that if $T$ is compact, then $T^*$ is compact.
Let $(y_n)$ be a sequence in the closed unit ball $B$ of $H$. We show that $(T^*y_n)$ contains
a convergent subsequence. For each $n \in \mathbb{N}$, define $\lambda_n \in H^*$ by $\lambda_n(x) = \langle x, y_n \rangle$. For
$x, x' \in H$,

\[
|\lambda_n(x) - \lambda_n(x')| = |\langle x - x', y_n \rangle| \le \|x - x'\| .
\]

It follows that the sequence $(\lambda_n)$ is equicontinuous on $H$ and, in particular, on $\overline{T(B)}$,
which is compact by assumption. Ascoli's theorem guarantees a subsequence $(\lambda_{n_k})$
that converges uniformly on $\overline{T(B)}$.
Now

\[
\|T^*y_{n_i} - T^*y_{n_j}\| = \sup_{x \in B} |\langle x, T^*(y_{n_i} - y_{n_j}) \rangle| = \sup_{x \in B} |\langle Tx, y_{n_i} - y_{n_j} \rangle|
  = \sup_{x \in B} |\langle Tx, y_{n_i} \rangle - \langle Tx, y_{n_j} \rangle|
  = \sup_{x \in B} |\lambda_{n_i}(Tx) - \lambda_{n_j}(Tx)| .
\]

The uniform convergence of $(\lambda_{n_k})$ on $\overline{T(B)}$ guarantees that the last quantity can be
made less than $\epsilon$ for sufficiently large integers $i$ and $j$. Thus $(T^*(y_{n_i}))$ is Cauchy and
hence convergent.


The Eigenvalues of a Compact Operator 


Theorem 7.4.7 (the Riesz-Schauder theorem). Let $T$ be compact, and let $r > 0$. 
Then the set of eigenvalues $\lambda$ of $T$ such that $|\lambda| \ge r$ is finite. 


Proof. Suppose there exist infinitely many eigenvalues $\lambda_n$ of $T$ such that $|\lambda_n| \ge r$. For 
each eigenvalue $\lambda_n$ choose an eigenvector $x_n$, and let $M_n = \mathrm{Span}\{x_1,\dots,x_n\}$. Note 
that $M_n$ is properly contained in $M_{n+1}$ and that $T(M_n) \subseteq M_n$. By Riesz's lemma, 
for every $n \ge 2$, there exists a unit vector $y_n \in M_n$ such that $\mathrm{dist}(y_n, M_{n-1}) \ge 1/2$. 
It is easy to verify that $(T - \lambda_mI)y_m \in M_{m-1}$. Now if $n < m$, then 
$Ty_n - (T - \lambda_mI)y_m \in M_{m-1}$; so $\frac{1}{\lambda_m}[Ty_n - (T - \lambda_mI)y_m] \in M_{m-1}$, and 


$$\|Ty_n - Ty_m\| = |\lambda_m| \Big\| \frac{1}{\lambda_m}[Ty_n - (T - \lambda_mI)y_m] - y_m \Big\| \ge |\lambda_m|\,\mathrm{dist}(y_m, M_{m-1}) \ge r/2.$$


Thus $(Ty_n)$ contains no convergent subsequence, contradicting the compactness of $T$. 


Theorem 7.4.8. For a compact operator $T$ on a separable Hilbert space $H$, 


(a) the set of eigenvalues of $T$ is at most countable and can be arranged as follows: 
$|\lambda_1| \ge |\lambda_2| \ge \dots$, and the set $\{\lambda_n\}$ of eigenvalues of $T$ has no nonzero limit 
points; 

(b) if $T$ has infinitely many eigenvalues, then $\lim_n \lambda_n = 0$; and 

(c) if $0 \ne \lambda \in \mathbb{C}$ and $L = T - \lambda I$, then $\dim(\mathrm{Ker}(L)) < \infty$. 


Proof. Let $\Lambda$ be the set of nonzero eigenvalues of $T$, and assume $\Lambda$ is infinite. Let $r$ be 
the spectral radius of $T$, and let $r_n = r/n$ ($n \in \mathbb{N}$). If $U_n$ is the complement in the 
complex plane of the closed disk of radius $r_n$ centered at 0, then $\mathbb{C} - \{0\} = \bigcup_{n=1}^{\infty} U_n$ 
and $\Lambda = \bigcup_{n=1}^{\infty} (\Lambda \cap U_n)$. Since each of the sets $\Lambda_n = \Lambda \cap U_n$ is finite by theorem 
7.4.7, $\Lambda$ is countable. If $\Lambda$ has a nonzero limit point $z$, then $z \in U_n$ for some 
positive integer $n$. Because $U_n$ is open, it contains a disk centered at $z$, and such 
a disk would contain infinitely many points of $\Lambda$. This would contradict the 
finiteness of $\Lambda_n$, so no such point $z$ exists. Next let $n \in \mathbb{N}$ be such that $\Lambda \cap U_n \ne \emptyset$. 
Since $\Lambda \cap U_n$ is finite, the eigenvalues $\lambda$ such that $|\lambda| > r_n$ can be enumerated such 
that $|\lambda_1| \ge |\lambda_2| \ge \dots \ge |\lambda_{k_1}|$. Since $\Lambda$ is infinite, there exists an integer $m > n$ such 
that $\Lambda \cap U_m$ properly contains $\Lambda \cap U_n$. Arrange the eigenvalues in $(U_m - U_n) \cap \Lambda$ 
in such a way that $|\lambda_{k_1+1}| \ge |\lambda_{k_1+2}| \ge \dots \ge |\lambda_{k_2}|$. Continuing in this manner, we 
can enumerate all the eigenvalues in the desired fashion. 


(b) Any disk centered at 0 contains all but finitely many of the points $\lambda_n$. This 
proves part (b). 


(c) Write $N_L$ for $\mathrm{Ker}(L)$. Note that $T(N_L) \subseteq N_L$; hence the restriction of $T$ to $N_L$ is 
compact. Since $Tx = \lambda x$ for all $x \in N_L$, $I = \frac{1}{\lambda}T$ on $N_L$. Thus the identity operator 
on $N_L$ is compact, so $N_L$ is finite dimensional. 


Now that we have established enough of the basic properties of compact operators, 
we give examples of how compact operators can be constructed. We hope this will 
help motivate some of the results we discuss later in the section. The following 
builds on the constructions of theorem 7.3.11. 


Example 7. Let $(P_n)$ be a sequence of pairwise orthogonal projections, and let $(\lambda_n)$ 
be a sequence of nonzero complex numbers such that $\lim_n \lambda_n = 0$. Define $T : 
H \to H$ by $Tx = \sum_{n=1}^{\infty} \lambda_nP_nx$. By theorem 7.3.11, $T \in \mathcal{L}(H)$. If, in addition, the 
rank of each $P_n$ is finite (i.e., $P_n$ projects $H$ onto a finite-dimensional subspace), 
then $T$ is compact, by theorems 7.4.4 and 7.4.2. By theorem 7.3.11, the $\lambda_n$ are all the 
nonzero eigenvalues of $T$. The importance of theorem 7.3.11 and this example 
is that they not only produce an abundance of examples of self-adjoint and 
compact operators but also illustrate that we have wide control over tailoring 
the spectrum, as the examples below illustrate. Also see problem 4 at the end 
of this section for an example of the ultimate tailoring of the spectrum of a 
bounded operator. 


Example 8. Let $\{u_n\}$ be an orthonormal basis for $H$ and define $T(x) = 
\sum_{n=1}^{\infty} \lambda_n\langle x, u_n\rangle u_n$, where $(\lambda_n)$ is a sequence of nonzero complex numbers such 
that $\lim_n \lambda_n = 0$. This is a special case of the previous example; each $P_n$ is the 
projection on the one-dimensional subspace spanned by $u_n$. In this example, 
$\lambda = 0$ is not an eigenvalue of $T$, as the reader can easily verify. ♦ 


Example 9. Let $\{u_n\}$ be an orthonormal basis for $H$ and define $T(x) = 
\sum_{n=1}^{\infty} \lambda_n\langle x, u_{2n}\rangle u_{2n}$, where $(\lambda_n)$ is a sequence of nonzero complex numbers 
such that $\lim_n \lambda_n = 0$. In this case, $\lambda = 0$ is an eigenvalue of $T$. In fact, 
$\dim(N(T)) = \infty$ since $T(u_{2n+1}) = 0$ for all positive integers $n$. 
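
A finite model may help to see why the operators in examples 7 through 9 are limits of finite-rank operators. The Python sketch below works in coordinates with respect to the orthonormal basis $\{u_n\}$; the choice $\lambda_n = 1/n$, the truncation size `N`, and the function names are assumptions made for illustration only.

```python
# Illustrative sketch (assumption: lambda_n = 1/n, truncated to N terms).
N = 200
lam = [1.0 / n for n in range(1, N + 1)]

def apply_diag(coeffs, x):
    """Apply x -> sum_n coeffs[n] <x, u_n> u_n in coordinates w.r.t. {u_n}."""
    return [c * t for c, t in zip(coeffs, x)]

def truncated(n):
    """Coefficients of the finite-rank partial sum T_n (first n terms kept)."""
    return lam[:n] + [0.0] * (N - n)

def diag_norm(coeffs):
    """Operator norm of a diagonal operator: the largest |coefficient|."""
    return max(abs(c) for c in coeffs)

def error_norm(n):
    """||T - T_n|| in the diagonal model, equal to sup_{k>n} |lambda_k|."""
    return diag_norm([c - d for c, d in zip(lam, truncated(n))])
```

In this model `error_norm(n)` equals $1/(n+1)$, so the partial sums converge to $T$ in the operator norm precisely because $\lambda_n \to 0$.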


The following subsection is independent of the subsequent subsections and can be 
bypassed without loss of continuity. 


The Fredholm Theory 


We will adopt the following standing assumptions for the remainder of this section: 
$T$ is a compact operator on a separable Hilbert space $H$, and $\lambda$ is a nonzero 
complex number. We also use the following notation: $L = T - \lambda I$, $L^* = T^* - \bar{\lambda}I$, 
$N_L = \mathrm{Ker}(L)$; $R_L = R(L)$; $N_{L^*} = \mathrm{Ker}(L^*)$; $R_{L^*} = R(L^*)$. 

In the calculations in the rest of this section, we repeatedly use the fact that $T$ 
commutes with the powers of $L$. This is because the powers of $T$ commute. 


Theorem 7.4.9. $R_L$ is closed. 


Proof. Let $X_L$ be a complement of $N_L$ in $H$. One exists by example 6 in section 6.4. 
We can choose $X_L = N_L^{\perp}$, but we are not making this election because the rest 
of the proof below works well with any complement of $N_L$. We first prove the 
following fact: There exists a constant $\delta > 0$ such that $\|Lu\| \ge \delta\|u\|$ for every 
$u \in X_L$. Suppose not. Then there exists a sequence $(u_n)$ in $X_L$ such that $\|u_n\| = 1$ 
and $\|Lu_n\| < 1/n$. Clearly, $Lu_n \to 0$ as $n \to \infty$. Since $T$ is compact, $(Tu_n)$ contains 
a convergent subsequence, $(Tw_n)$. Thus $w_n = \frac{1}{\lambda}Tw_n - \frac{1}{\lambda}Lw_n$ is convergent. Let 
$w = \lim_n w_n$. Since $X_L$ is closed, $w \in X_L$. Now $w = \lim_n w_n = \frac{1}{\lambda}\lim_n (Tw_n - 
Lw_n) = \frac{1}{\lambda}Tw$. Thus $Tw = \lambda w$; hence $w \in N_L \cap X_L = \{0\}$. This contradicts the fact 
that $\|w\| = \lim_n \|w_n\| = 1$ and establishes the fact. We now prove that $R_L$ is closed. 
Suppose $(Ly_n)$ is a convergent sequence in $R_L$. We need to show that $\lim_n Ly_n \in R_L$. 
Write $y_n = u_n + w_n$, where $u_n \in X_L$ and $w_n \in N_L$. Note that $Ly_n = Lu_n$, so $(Lu_n)$ 
is convergent. By the above fact, $\|u_n - u_m\| \le \frac{1}{\delta}\|Lu_n - Lu_m\| \to 0$ as $m, n \to \infty$. 
Thus $(u_n)$ is a Cauchy sequence, so $u = \lim_n u_n$ exists. Finally, $\lim_n Ly_n = \lim_n Lu_n = Lu \in R_L$. 


Remarks. (a) An immediate consequence of theorem 7.4.9 is that $H = N_{L^*} \oplus R_L$, 
because, by theorem 7.3.12, $R_L^{\perp} = N_{L^*}$. Since $R_L$ is closed, $R_L = N_{L^*}^{\perp}$, which 
is the result we seek. 

(b) Since $L^* = T^* - \bar{\lambda}I$, and $T^*$ is compact, theorems 7.4.8(c) and 7.4.9 imply that $N_{L^*}$ 
is finite dimensional and that $R_{L^*}$ is closed. As in remark (a), $H = N_L \oplus R_{L^*}$. 

(c) By the above remarks, $\mathrm{codim}(R_L) = \dim(N_{L^*})$, and $\mathrm{codim}(R_{L^*}) = \dim(N_L)$. 
It is also true that $\dim(N_L) = \dim(N_{L^*})$. It follows that the numbers 
$\dim(N_L)$, $\dim(N_{L^*})$, $\mathrm{codim}(R_L)$, and $\mathrm{codim}(R_{L^*})$ are all finite and equal. 
The proof that $\dim(N_{L^*}) = \dim(N_L)$ appears at the end of this subsection. 
See theorem 7.4.15. 


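Lemma 7.4.10. Let $N_{L^n}$ denote $\mathrm{Ker}(L^n)$. Then $N_{L^n}$ is finite dimensional, and $N_{L^n} \subseteq 
N_{L^{n+1}}$. Moreover, there exists an integer $n$ such that, for every $k \ge n$, $N_{L^k} = N_{L^n}$. 


Proof. Observe that 


$$L^n = (T - \lambda I)^n = \sum_{i=0}^{n} \binom{n}{i} (-\lambda)^i T^{n-i}$$


$$= \big(T^n - n\lambda T^{n-1} + \dots + n(-\lambda)^{n-1}T\big) - \big[(-1)^{n+1}\lambda^n\big]I.$$


The operator $K = T^n - n\lambda T^{n-1} + \dots + n(-\lambda)^{n-1}T$ is compact by theorems 7.4.2 
and 7.4.3. Since $(-1)^{n+1}\lambda^n \ne 0$, the kernel $N_{L^n}$ of $L^n$ is finite dimensional by 
theorem 7.4.8. The fact that $N_{L^n} \subseteq N_{L^{n+1}}$ is obvious. If $N_{L^n} = N_{L^{n+1}}$ for some $n$, 
then $N_{L^k} = N_{L^n}$ for all $k \ge n$: indeed, if $k \ge n$ and $L^{k+1}x = 0$, then $L^{k-n}x \in N_{L^{n+1}} = N_{L^n}$, 
so $L^kx = 0$. Now suppose, for a contradiction, 
that $N_{L^n} \ne N_{L^{n+1}}$ for all $n \in \mathbb{N}$. By Riesz's lemma, choose a unit vector 
$u_n \in N_{L^{n+1}}$ such that $\mathrm{dist}(u_n, N_{L^n}) \ge 1/2$. We claim that $(Tu_n)$ contains no convergent 
subsequence, contradicting the compactness of $T$, and concluding the proof. 
For $n > m$, 


$$Tu_n - Tu_m = \lambda u_n - (Tu_m - Lu_n). \qquad (4)$$


Now $L^n(Tu_m - Lu_n) = T(L^nu_m) - L^{n+1}u_n = 0 - 0 = 0$. Thus $Tu_m - Lu_n \in 
N_{L^n}$, and, by (4), $\|Tu_n - Tu_m\| = |\lambda|\,\|u_n - \frac{1}{\lambda}(Tu_m - Lu_n)\| \ge |\lambda|/2$, which is 
the contradiction we seek. In the above computation, we used the fact that $T$ and 
$L^n$ commute. ∎ 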


Lemma 7.4.11. Let $R_{L^n} = R((T - \lambda I)^n)$. Then each $R_{L^n}$ is closed, $R_{L^n} \supseteq R_{L^{n+1}}$, and 
there exists a positive integer $n$ such that $R_{L^k} = R_{L^n}$ for all $k \ge n$. 


Proof. As in the proof of the previous lemma, $L^n = K - [(-1)^{n+1}\lambda^n]I$, where $K$ is 
a compact operator; hence, by theorem 7.4.9, $R_{L^n}$ is closed. The inclusions $R_{L^n} \supseteq 
R_{L^{n+1}}$ are obvious. If $R_{L^n} = R_{L^{n+1}}$ for some $n$, then applying $L^{k-n}$ to both sides 
gives $R_{L^k} = R_{L^{k+1}}$ for all $k \ge n$. If $R_{L^n} \ne R_{L^{n+1}}$ for all $n$, then, by Riesz's lemma, choose a unit 
vector $u_n \in R_{L^n}$ such that $\mathrm{dist}(u_n, R_{L^{n+1}}) \ge 1/2$. We claim that $(Tu_n)$ contains no 
convergent subsequence, contradicting the compactness of $T$, and concluding the 
proof. For $n > m$, 


$$Tu_m - Tu_n = \lambda u_m - (Tu_n - Lu_m). \qquad (5)$$


Now $Tu_n - Lu_m \in R_{L^{m+1}}$; hence, by (5), 


$$\|Tu_m - Tu_n\| = |\lambda|\,\Big\|u_m - \frac{1}{\lambda}(Tu_n - Lu_m)\Big\| \ge |\lambda|/2.$$


Proposition 7.4.12 (the Fredholm alternative theorem). The operator $L$ is surjective 
if and only if it is injective. Symbolically, $R_L = H$ if and only if $N_L = \{0\}$. 


Proof. Suppose $R_L = H$. If $N_L \ne \{0\}$, then there exists a vector $u_0 \ne 0$ such that 
$Lu_0 = 0$. Since $L$ is onto, there is a vector $u_1$ such that $Lu_1 = u_0$, and, by induction, 
there exists a sequence of nonzero vectors $u_1, u_2, \dots$ such that $Lu_i = u_{i-1}$. Now, for 
all $n$, $L^nu_n = u_0 \ne 0$, but $L^{n+1}u_n = 0$. Thus $u_n \in N_{L^{n+1}} - N_{L^n}$. This contradicts 
lemma 7.4.10. 


Conversely, suppose $N_L = \{0\}$. Note that $N_{L^n} = \{0\}$, that is, $L^n$ is one-to-one. If 
$R_L \ne H$, then there is an element $x \notin R_L$. In this case, for all $y \in H$, $L^nx - L^{n+1}y = 
L^n(x - Ly) \ne 0$, because $x \ne Ly$ and $L^n$ is injective. Hence $L^nx \ne L^{n+1}y$ for all 
$y \in H$. This means that $R_{L^n}$ strictly contains $R_{L^{n+1}}$ for all $n$, thus contradicting 
lemma 7.4.11. 


Theorem 7.4.13. Let $T$ be compact. If $\lambda \ne 0$, and $\lambda$ is not an eigenvalue, then $T - \lambda I$ 
is invertible. In other words, all the nonzero elements of the spectrum of a compact 
operator are eigenvalues. 


Proof. Suppose $\lambda \ne 0$ and that $\lambda$ is not an eigenvalue, that is, $N_L = \{0\}$. By proposition 
7.4.12, $R_L = H$. Hence $L = T - \lambda I$ is one-to-one and onto and hence is 
invertible by theorem 6.3.7. 


Remark. It follows from theorem 7.4.8 and the previous theorem that the spectrum 
of a compact operator on an infinite-dimensional separable Hilbert space 
is $\{0, \lambda_1, \lambda_2, \dots\}$. 


The following result is an immediate consequence of proposition 7.4.12 and 
theorem 7.4.13. 


Theorem 7.4.14 (the Fredholm alternative theorem). Let $T$ be a compact operator, 
and let $\lambda \ne 0$. Then exactly one of the following holds: 


(a) $T - \lambda I$ is invertible. 
(b) $\lambda$ is an eigenvalue of $T$. ∎ 


We conclude this subsection by furnishing the proof of a result we mentioned 
earlier. 


Theorem 7.4.15. Let $T$ be a compact operator on a separable Hilbert space $H$, and, 
for a complex number $\lambda \ne 0$, let $L = T - \lambda I$, $L^* = T^* - \bar{\lambda}I$. Then $N_L$ and $N_{L^*}$ have 
the same (finite) dimension. 


Proof. Suppose, for a contradiction, that $\dim(N_L) = m < n = \dim(N_{L^*})$, and let 
$\{u_1,\dots,u_m\}$ and $\{v_1,\dots,v_n\}$ be orthonormal bases for $N_L$ and $N_{L^*}$, respectively. 
Define a finite-rank operator on $H$ by 


$$F(x) = \sum_{i=1}^{m} \langle x, u_i\rangle v_i.$$


Notice that $Fu_i = v_i$ for $1 \le i \le m$ and that the restriction of $F$ to $N_L$ is one-to-one. 
The operator $K = T + F$ is compact. We claim that $K - \lambda I$ is one-to-one. 
If $(K - \lambda I)(x) = 0$, then $(T - \lambda I)(x) = -Fx \in R_L \cap N_{L^*} = \{0\}$. Thus 
$(T - \lambda I)(x) = 0 = Fx$. In particular, $x \in N_L$. Because $F|_{N_L}$ is one-to-one, $x = 0$, 
and this proves our claim. By the Fredholm alternative theorem, $K - \lambda I$ is 
onto, which is a contradiction because $R(K - \lambda I) \subseteq R_L \oplus \mathrm{Span}\{v_1,\dots,v_m\}$ and 
$v_{m+1} \notin R_L \oplus \mathrm{Span}\{v_1,\dots,v_m\}$. This contradiction shows that $n \le m$. By the 
preceding part of the proof and the fact that $L^{**} = L$, $m \le n$, and the proof is 
complete. 


The Spectral Theorem 


The discussion so far shows that compact operators, like self-adjoint operators, 
share some properties with operators on finite-dimensional spaces. When we limit 
our attention to compact, self-adjoint operators, we obtain results that directly 
extend those of the finite-dimensional case. 


Lemma 7.4.16. If $T$ is a nonzero compact, self-adjoint operator on a separable 
Hilbert space, then $\|T\|$ or $-\|T\|$ is an eigenvalue of $T$. In particular, every nonzero 
compact, self-adjoint operator has a nonzero eigenvalue. 


Proof. By the proof of theorem 7.3.8, there exists a real number $\lambda$ and a sequence of 
unit vectors $(y_n)$ such that $|\lambda| = \|T\|$ and $\lim_n (Ty_n - \lambda y_n) = 0$. Since $T$ is compact, 
$(y_n)$ contains a subsequence $(u_n)$ such that $(Tu_n)$ is convergent. It follows that 
$(u_n)$ is convergent (it is the difference between the two convergent sequences 
$\frac{1}{\lambda}Tu_n$ and $\frac{1}{\lambda}[Tu_n - \lambda u_n]$). Let $u = \lim_n u_n$. Now $\lim_n (Tu_n - \lambda u_n) = 0$; hence 
$Tu - \lambda u = 0$. Since $u \ne 0$, $\lambda$ is an eigenvalue of $T$. 


In light of theorems 7.3.8 and 7.4.13, $r(T) = \|T\| = |\lambda_1|$, where $\lambda_1$ is an eigenvalue of 
$T$ of largest absolute value. Thus the previous lemma is, in fact, redundant. However, we decided to include 
it here in order to make this subsection self-contained and independent of the 
Fredholm theory. 


Theorem 7.4.17 (the Hilbert-Schmidt theorem). Let T be a compact self-adjoint 
operator on a separable Hilbert space H. Then H possesses an orthonormal basis 
of eigenvectors of T. 


Proof. Let $\lambda_1, \lambda_2, \dots$ be the nonzero eigenvalues of $T$, and, for each $n$, let $B_n$ be an 
orthonormal basis for the (finite-dimensional) eigenspace, $V_n$, that corresponds to 
$\lambda_n$. The reader should keep in mind that the set of eigenvalues may be finite. Since 
the eigenspaces are mutually orthogonal, the set $B = \bigcup_n B_n$ is an orthonormal 
set. Let $M$ be the closure of the span of $B$, and let $N = M^{\perp}$. Since each $V_n$ is 
$T$-invariant, so is $M$. It follows that $N$ is also $T$-invariant (see problem 13 on section 
7.3). If $N = \{0\}$, then $M = H$ and $B$ is the desired orthonormal basis for $H$. 


If $N \ne \{0\}$, the restriction of $T$ to $N$ is compact and self-adjoint. If $T|_N$ is not the 
zero operator, then, by lemma 7.4.16, $T|_N$ has a nonzero eigenvalue $\lambda$, which is also 
an eigenvalue of $T$. Since the set $\{\lambda_1, \lambda_2, \dots\}$ contains all the nonzero eigenvalues 
of $T$, $\lambda = \lambda_n$ for some $n$. This is a contradiction because then an eigenvector $v$ 
that corresponds to $\lambda$ would be in $M \cap N = \{0\}$. This shows that $N \subseteq \mathrm{Ker}(T)$. In 
particular, $\mathrm{Ker}(T) \ne \{0\}$; hence $\lambda = 0$ is an eigenvalue of $T$. Now we show that 
$N = \mathrm{Ker}(T)$. If $x \in \mathrm{Ker}(T)$, then by the orthogonality of eigenvectors belonging to 
distinct eigenvalues, $x \perp u$ for every $u \in B$. Thus $x \in M^{\perp} = N$, and $N = \mathrm{Ker}(T)$. 
Now choose an orthonormal basis $B_0$ of $N$. The set $B \cup B_0$ is an orthonormal basis 
of $M \oplus N = H$ consisting entirely of eigenvectors of $T$. 


We now arrive at the spectral theorem for compact self-adjoint operators. 


Theorem 7.4.18 (the spectral theorem). Let $T$ be a compact self-adjoint operator 
on a separable Hilbert space $H$, and let $\{u_n\}$ be an orthonormal basis of 
eigenvectors of $T$ corresponding to the eigenvalues $\{\lambda_n\}$. Then, for every $x \in H$, 
$$Tx = \sum_{n=1}^{\infty} \lambda_n\langle x, u_n\rangle u_n.$$


Proof. Write $x = \sum_{n=1}^{\infty} \langle x, u_n\rangle u_n$. Then 


$$Tx = T\Big(\sum_{n=1}^{\infty} \langle x, u_n\rangle u_n\Big) = \sum_{n=1}^{\infty} \langle x, u_n\rangle Tu_n = \sum_{n=1}^{\infty} \lambda_n\langle x, u_n\rangle u_n.$$


The spectral theorem is the exact analog of the finite-dimensional case for a 
Hermitian matrix. If we define $P_n$ to be the projection on the one-dimensional 
subspace spanned by $u_n$, then $P_n$ is a rank-1 operator, and $T = \sum_{n=1}^{\infty} \lambda_nP_n$. Notice 
that the series $\sum_{n=1}^{\infty} \lambda_nP_n$ converges in the operator norm by theorem 7.3.11. 
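
The finite-dimensional analog mentioned above can be verified by hand or by machine. The sketch below checks the formula $Tx = \sum_n \lambda_n\langle x, u_n\rangle u_n$ for one real symmetric $2 \times 2$ matrix; the matrix, its eigenpairs, and the helper names are illustrative assumptions, not taken from the text.

```python
import math

# Illustrative sketch: T = [[2, 1], [1, 2]] has eigenvalues 3 and 1 with
# orthonormal eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
T = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(3.0, [s, s]), (1.0, [s, -s])]

def inner(x, y):
    """Real inner product <x, y>."""
    return sum(a * b for a, b in zip(x, y))

def apply_matrix(A, x):
    """Ordinary matrix-vector product."""
    return [inner(row, x) for row in A]

def spectral_apply(x):
    """Compute sum_n lambda_n <x, u_n> u_n."""
    out = [0.0, 0.0]
    for lam, u in eigenpairs:
        c = lam * inner(x, u)
        out = [o + c * ui for o, ui in zip(out, u)]
    return out
```

For any vector `x`, `apply_matrix(T, x)` and `spectral_apply(x)` agree up to rounding, which is the content of the spectral theorem in this small case.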


Remarks. 1. If $\lambda = 0$ is an eigenvalue of $T$, then $\lambda = 0$ contributes nothing to 
the sum in the statement of theorem 7.4.18. Consequently, if $y \in R(T)$, 
then $y = \sum_{n=1}^{\infty} \langle y, u_n\rangle u_n$, where the series involves only the eigenvectors that 
correspond to the nonzero eigenvalues. 
2. The proof of theorem 7.4.17 and remark 1 reveal that $T$ maps $H$ into the 
orthogonal complement of $N(T)$, which is nothing other than the closure of 
the span of the eigenvectors that belong to the nonzero eigenvalues of $T$. 


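Example 10. Let $T$ be a compact self-adjoint operator, let $\{\lambda_n\}$ be the nonzero 
eigenvalues of $T$, and let $u_n$ be the corresponding eigenvectors. For a fixed 
$g \in H$, consider the equation $Tf - \lambda f = g$. We write $f_n = \langle f, u_n\rangle$ and 
$g_n = \langle g, u_n\rangle$, and we work out two cases: 


(a) Suppose $\lambda \ne 0$ is not an eigenvalue. In this case, $T - \lambda I$ is invertible, and the 
equation has a unique solution $f$. To find $f$, observe that the equation can be 
written as $Tf = \lambda f + g$; hence $\lambda f + g \in R(T)$. Remark 1 implies that 


$$\lambda f + g = \sum_{n=1}^{\infty} \langle \lambda f + g, u_n\rangle u_n = \sum_{n=1}^{\infty} [\lambda f_n + g_n]u_n. \qquad (6)$$


By theorem 7.4.18, 


$$Tf = \sum_{n=1}^{\infty} \lambda_nf_nu_n. \qquad (7)$$


Equating the Fourier coefficients of the two series in (6) and (7), we obtain 
$\lambda_nf_n = \lambda f_n + g_n$, which gives $f_n = \frac{g_n}{\lambda_n - \lambda}$, and, since $f = \frac{1}{\lambda}(Tf - g)$, the unique solution of the 
equation is 


$$f = -\frac{g}{\lambda} + \sum_{n=1}^{\infty} \frac{\lambda_n\langle g, u_n\rangle}{\lambda(\lambda_n - \lambda)}u_n.$$


(b) If $\lambda = \lambda_n$, and $\lambda_n$ is a simple eigenvalue with eigenvector $v$, then the equation 
has a solution if and only if $\langle g, v\rangle = 0$ and, in this case, the solutions are of the 
form 


$$f = -\frac{g}{\lambda_n} + av + \sum\Big\{\frac{\lambda_k\langle g, u_k\rangle}{\lambda_n(\lambda_k - \lambda_n)}u_k : k \ne n\Big\}, \text{ where } a \text{ is an arbitrary scalar.}$$


To see this, one duplicates case (a) to obtain the equations 


$$\lambda_kf_k = \lambda_nf_k + g_k \text{ for all } k \in \mathbb{N}. \qquad (8)$$


When $k = n$, the equation $\lambda_nf_n = \lambda_nf_n + g_n$ is satisfied if and only if $g_n = 0$. In 
this case, $f_n$ is arbitrary and, for $k \ne n$, $f_k$ is uniquely determined by equation 
(8), so we have arrived at the stated solution. ♦ 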
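
Case (a) of example 10 can be tried on a small matrix. The sketch below is an illustration under stated assumptions: the symmetric matrix `T`, the value of `lam`, and the function names are hypothetical choices, and the formula coded in `solve` is the one obtained from the Fourier coefficients $f_n = g_n/(\lambda_n - \lambda)$ together with $f = \frac{1}{\lambda}(Tf - g)$.

```python
import math

# Illustrative sketch: T = [[2, 1], [1, 2]] with eigenvalues 3 and 1.
T = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(3.0, [s, s]), (1.0, [s, -s])]

def inner(x, y):
    """Real inner product <x, y>."""
    return sum(a * b for a, b in zip(x, y))

def solve(g, lam):
    """Solve Tf - lam f = g for lam != 0 that is not an eigenvalue, using
    f = -g/lam + sum_n lam_n <g, u_n> / (lam (lam_n - lam)) u_n."""
    f = [-gi / lam for gi in g]
    for lam_n, u in eigenpairs:
        c = lam_n * inner(g, u) / (lam * (lam_n - lam))
        f = [fi + c * ui for fi, ui in zip(f, u)]
    return f

def residual(f, g, lam):
    """Return Tf - lam f - g, which should vanish for a solution."""
    Tf = [inner(row, f) for row in T]
    return [t - lam * fi - gi for t, fi, gi in zip(Tf, f, g)]
```

With `g = [1.0, 0.0]` and `lam = 2.0` the equation reads $f_2 = 1$, $f_1 = 0$, and `solve` reproduces that vector up to rounding.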


See problem 8 at the end of the section for the continuation of this example. 
We are now ready to prove the spectral theorem for compact normal operators. 


Theorem 7.4.19. Let T be a compact normal operator on a separable Hilbert space 
H. Then H possesses an orthonormal basis of eigenvectors of T. 


Proof. Consider the self-adjoint operator $U = TT^* = T^*T$. Observe that $T$ and $U$ 
commute: $TU = TT^*T = T^*TT = UT$. 


We show that if $\lambda_0 = 0$ is an eigenvalue of $U$, then $\lambda_0$ is an eigenvalue of $T$ and 
$\mathrm{Ker}(T) = \mathrm{Ker}(U)$. If $U(x) = 0$, then 


$$\|T(x)\|^2 = \langle Tx, Tx\rangle = \langle x, T^*Tx\rangle = \langle x, Ux\rangle = \langle x, 0\rangle = 0.$$


Conversely, if $Tx = 0$, then $U(x) = T^*(T(x)) = 0$. Now let $\lambda_1, \lambda_2, \dots$ be the distinct 
nonzero eigenvalues of $U$, and let $V_1, V_2, \dots$ be the corresponding finite-dimensional, 
mutually orthogonal eigenspaces. We show that each $V_n$ is $T$-invariant. 
If $x \in V_n$, then $(U - \lambda_nI)(Tx) = (UT - \lambda_nT)(x) = (TU - \lambda_nT)(x) = 
T(U - \lambda_nI)(x) = T(0) = 0$. Thus $Tx \in V_n$. Now $T|_{V_n}$ is a normal operator on the 
finite-dimensional space $V_n$. By theorem 3.7.15, $V_n$ has an orthonormal basis $B_n$ of eigenvectors 
of $T$. Choose an orthonormal basis $B_0$ for $V_0 = \mathrm{Ker}(T) = \mathrm{Ker}(U)$. By theorem 
7.4.17, $H = \overline{\mathrm{Span}}\{\bigcup_{n=0}^{\infty} V_n\}$. Since $V_n = \mathrm{Span}(B_n)$, $\bigcup_{n=0}^{\infty} B_n$ is an orthonormal 
basis for $H$. 


Example 11. Let $T : l^2 \to l^2$ be the finite-rank operator 


$$T(x_1, x_2, \dots) = (ix_1, -ix_2, (1+i)x_3, (1-i)x_4, 0, 0, \dots).$$


It is easy to verify that $T^*(x_1, x_2, \dots) = (-ix_1, ix_2, (1-i)x_3, (1+i)x_4, 0, 0, \dots)$ and 
that $T$ is a normal operator. The self-adjoint operator $U = T^*T$ is given by 


$$U(x_1, x_2, \dots) = (x_1, x_2, 2x_3, 2x_4, 0, 0, \dots).$$


The three eigenvalues of $U$ are $\lambda_0 = 0$, $\lambda_1 = 1$, and $\lambda_2 = 2$ with eigenspaces 
$\mathrm{Ker}(U) = \{(0,0,0,0,x_5,x_6,\dots)\}$, $V_1 = \mathrm{Span}\{e_1,e_2\}$, and $V_2 = \mathrm{Span}\{e_3,e_4\}$, 
respectively. 

The nonzero eigenvalues of $T$ are $\pm i$ and $1 \pm i$, and the corresponding eigenvectors 
are $e_1$, $e_2$, $e_3$, and $e_4$, respectively. Also, $\lambda_0 = 0$ is an eigenvalue of $T$ 
and $\mathrm{Ker}(T) = \mathrm{Ker}(U)$. In the notation of the previous theorem, $B_0 = \{e_5, e_6, \dots\}$, 
$B_1 = \{e_1, e_2\}$, and $B_2 = \{e_3, e_4\}$. 
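
The claims in example 11 only involve the first four coordinates, so they can be checked with a few lines of complex arithmetic. The following sketch is illustrative; the list `t` and the function names are assumptions introduced here, and vectors are represented by their first four coordinates only.

```python
# Diagonal entries of T on the first four coordinates (all others are 0).
t = [1j, -1j, 1 + 1j, 1 - 1j]

def apply_T(x):
    """T acts coordinatewise: (Tx)_k = t_k x_k."""
    return [tk * xk for tk, xk in zip(t, x)]

def apply_T_star(x):
    """The adjoint multiplies by the complex conjugates of the t_k."""
    return [tk.conjugate() * xk for tk, xk in zip(t, x)]

def apply_U(x):
    """U = T*T, which multiplies the k-th coordinate by |t_k|^2."""
    return apply_T_star(apply_T(x))
```

Applying `apply_U` to `[1, 2, 3, 4]` multiplies the coordinates by 1, 1, 2, 2, in agreement with the formula for $U$, and composing in the other order gives the same vector, which is the normality of $T$.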


Excursion: Integral Equations 


The theory of compact operators has deep roots in the study of integral equations, 
and this section would not be complete without a brief mention of integral 
equations. 

Consider the Fredholm integral equation 


$$Tu - \lambda u = f,$$


where $T$ is the integral operator generated by the function $K(x, \xi)$, 


$$Tu(x) = \int_a^b K(x, \xi)u(\xi)\,d\xi.$$


The complex function $K(x, \xi)$ on the square $[a, b] \times [a, b]$ is called the kernel of the 
operator, and we limit ourselves to Hilbert-Schmidt kernels since these, as it turns 
out, define compact integral operators on $\mathfrak{L}^2 = \mathfrak{L}^2[a, b]$. 


Definition. The function $K(x, \xi)$ is said to be a Hilbert-Schmidt kernel if 
$\int_a^b\int_a^b |K(x, \xi)|^2\,dx\,d\xi < \infty$. 

We now prove that a Hilbert-Schmidt kernel generates a compact integral 
operator on $\mathfrak{L}^2$, and we achieve this in a number of steps. 


Theorem 7.4.20. If $K(x, \xi)$ is continuous on the closed square $[a, b] \times [a, b]$ and 
$u \in \mathfrak{L}^2$, then $Tu$ is continuous on $[a, b]$. 


Footnote (example 11): The set $\{e_n\}$ is the canonical basis for K(N). 


Proof. Let $\epsilon > 0$. By the uniform continuity of $K$ on $[a, b] \times [a, b]$, there exists a 
number $\delta > 0$ such that if $|x_1 - x_2| < \delta$, then $|K(x_1, \xi) - K(x_2, \xi)| < \epsilon$ for all $\xi \in [a, b]$. Now using 
the Cauchy-Schwarz inequality, 


$$|Tu(x_1) - Tu(x_2)| = \Big|\int_a^b (K(x_1, \xi) - K(x_2, \xi))u(\xi)\,d\xi\Big|$$


$$\le \int_a^b |(K(x_1, \xi) - K(x_2, \xi))u(\xi)|\,d\xi$$


$$\le \Big(\int_a^b |K(x_1, \xi) - K(x_2, \xi)|^2\,d\xi\Big)^{1/2}\Big(\int_a^b |u(\xi)|^2\,d\xi\Big)^{1/2}$$


$$\le \Big(\int_a^b \epsilon^2\,d\xi\Big)^{1/2}\|u\|_2 = \epsilon(b - a)^{1/2}\|u\|_2.$$


This proves the continuity of $Tu$. 


Corollary 7.4.21. If $K$ is continuous on $[a, b] \times [a, b]$, and $\mathfrak{F}$ is a bounded subset of 
$\mathfrak{L}^2$, then $T(\mathfrak{F})$ is equicontinuous. 


Proof. This is obvious from the proof of the previous theorem since if $\|u\|_2 \le C$ 
for all $u \in \mathfrak{F}$, then $|Tu(x_1) - Tu(x_2)| \le C(b - a)^{1/2}\epsilon$ for all $x_1, x_2 \in [a, b]$ with 
$|x_1 - x_2| < \delta$. 


Theorem 7.4.22. If $K(x, \xi)$ is continuous on $[a, b] \times [a, b]$, then the integral operator 
it generates is a compact operator on $\mathfrak{L}^2$. 


Proof. This result follows from the previous corollary and Ascoli's theorem. If $\{u_n\}$ is 
a bounded sequence in $\mathfrak{L}^2$, then $(Tu_n)$ is equicontinuous and bounded in $C[a, b]$; 
hence it contains a subsequence $(Tu_{n_k})$ that converges uniformly in $C[a, b]$. Since, 
for any function $u \in C[a, b]$, $\|u\|_2 \le (b - a)^{1/2}\|u\|_{\infty}$, the subsequence $(Tu_{n_k})$ is 
convergent in $\mathfrak{L}^2$. 


We now prove the result we seek. 


Theorem 7.4.23. If $K(x, \xi)$ is a Hilbert-Schmidt kernel, then the integral operator it 
generates is compact. 


Proof. We utilize the fact that $C([a, b] \times [a, b])$ is dense in $\mathfrak{L}^2([a, b] \times [a, b])$. 
Let $K_n(x, \xi)$ be a sequence of continuous functions on $[a, b] \times [a, b]$ such that 
$\lim_n \|K_n - K\|_2 = 0$. It suffices to show that if $T_n$ is the compact integral operator 
generated by $K_n$, then $\lim_n \|T_n - T\| = 0$ in $\mathcal{L}(\mathfrak{L}^2)$. Now 


$$\|T_nu - Tu\|_2^2 = \int_a^b \Big|\int_a^b (K_n(x, \xi) - K(x, \xi))u(\xi)\,d\xi\Big|^2 dx$$
$$\le \int_a^b\int_a^b |K_n(x, \xi) - K(x, \xi)|^2\,d\xi\,dx \int_a^b |u(\xi)|^2\,d\xi = \|K_n - K\|_2^2\,\|u\|_2^2.$$


Thus $\|T_n - T\| \le \|K_n - K\|_2 \to 0$ as $n \to \infty$. 


Example 12. Consider the Hilbert-Schmidt kernel $K(x, \xi) = \cos x\cos\xi$, $0 \le x, \xi \le 
\pi$, and let $T$ be the corresponding integral operator. 

If $\lambda \ne 0$ is an eigenvalue of $T$ and $u$ is the corresponding eigenvector, then 
$(Tu)(x) = \cos x\int_0^{\pi} \cos\xi\,u(\xi)\,d\xi = \lambda u(x)$. It follows that $u$ is a multiple of 
$\cos x$, the only nonzero eigenvalue is $\lambda_1 = \pi/2$, and the normalized eigenvector 
is $u_1(x) = \sqrt{2/\pi}\cos x$. In this case, 0 is an eigenvalue of infinite multiplicity, and 
the null-space of $T$ is the orthogonal complement of the one-vector set $\{\cos x\}$. 
Using the result of example 10, we find the solutions of the equation $Tf - \lambda f = g$ 
in two cases: 

(a) $0 \ne \lambda \ne \pi/2$. The unique solution of the equation is 
$$f(x) = \frac{-g(x)}{\lambda} + \frac{c\cos x}{\lambda\left(\frac{\pi}{2} - \lambda\right)},$$
where $c = \int_0^{\pi} g(\xi)\cos\xi\,d\xi$. 

(b) The equation $Tf - \frac{\pi}{2}f = g$ has a solution if and only if $\langle g, \cos x\rangle = 0$, and, in 
this case, $f = \frac{-2g(x)}{\pi} + k\cos x$, where $k$ is an arbitrary constant. ♦ 
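
The computations of example 12 can be reproduced numerically. The sketch below approximates the integral by the midpoint rule; the number of nodes `M`, the sample right-hand side $g(x) = x$, and the function names are assumptions made for illustration, and the formula coded in `solve` is the one stated in case (a).

```python
import math

M = 2000                      # number of midpoint nodes (an assumption)
h = math.pi / M
nodes = [(k + 0.5) * h for k in range(M)]

def apply_T(u):
    """Midpoint-rule approximation of (Tu)(x) = cos x * int_0^pi cos(xi) u(xi) d xi.
    Returns Tu as a function of x."""
    integral = h * sum(math.cos(xi) * u(xi) for xi in nodes)
    return lambda x: math.cos(x) * integral

def solve(g, lam):
    """Case (a): f = -g/lam + c cos x / (lam (pi/2 - lam)), c = int g(xi) cos(xi) d xi."""
    c = h * sum(g(xi) * math.cos(xi) for xi in nodes)
    return lambda x: -g(x) / lam + c * math.cos(x) / (lam * (math.pi / 2 - lam))
```

Applying `apply_T` to `math.cos` returns approximately $(\pi/2)\cos x$, consistent with $\lambda_1 = \pi/2$, and for $\lambda = 1$ the function returned by `solve` satisfies $Tf - \lambda f = g$ up to rounding in this discretization.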


Exercises 


1. Prove that $\mathcal{K}$ is a subspace of $\mathcal{L}(H)$. 

2. The following characterization of compact operators is sometimes handy. 
Let $T \in \mathcal{L}(H)$. Prove that $T$ is compact if and only if $x_n \xrightarrow{w} x$ implies that 
$Tx_n$ converges in the norm to $Tx$. Hint: If $x_n \xrightarrow{w} x$, then $Tx_n \xrightarrow{w} Tx$. Now 
see problem 10 on section 6.7. 

3. Let $\{\lambda_n\}$ be a bounded sequence of complex numbers, and let $\{u_n\}$ be 
an orthonormal basis for $H$. Prove that the function $T : H \to H$ defined 
by $Tx = \sum_{n=1}^{\infty} \lambda_n\langle x, u_n\rangle u_n$ is a bounded operator. Also show that $T^*x = 
\sum_{n=1}^{\infty} \bar{\lambda}_n\langle x, u_n\rangle u_n$ and hence $T$ is self-adjoint if and only if each $\lambda_n$ is real. 
Finally, show that $\{\lambda_n\}$ are all the eigenvalues of $T$. 

4. This is a continuation of the previous exercise. Show that every compact 
subset $C$ of the complex plane is the spectrum of a bounded operator. Hint: 
Let $\{\lambda_n\}$ be a dense subset of $C$, and define $T$ as in the previous exercise. 
Since $\{\lambda_n\} \subseteq \sigma(T)$, $C \subseteq \sigma(T)$. To show that $\sigma(T) \subseteq C$, let $\lambda \notin C$, and show 
that $(T - \lambda I)^{-1}x = \sum_{n=1}^{\infty} (\lambda_n - \lambda)^{-1}\langle x, u_n\rangle u_n$. 

5. In this exercise, we construct a compact operator which has no eigenvalues. 
Define a bounded operator $T : l^2 \to l^2$ by $T(x) = (0, x_1, x_2/2, x_3/3, \dots)$. Show 
that $T$ is compact and that $\sigma(T) = \{0\}$. Hint: $T = RD$, where $R$ is the right 
shift operator and $D(x) = (x_1, x_2/2, \dots)$. $D$ is compact by example 4. Show 
directly that $T$ has no eigenvalues. 

6. Give a direct proof of the following theorem without using the subsection 
on the Fredholm theory. Let $T$ be a compact self-adjoint operator on a 
separable Hilbert space. Prove that the nonzero points of the spectrum of 
$T$ are eigenvalues. Hint: Use problem 15 on section 7.3, and examine the 
proof of lemma 7.4.16. 

7. Let $T$ be a compact self-adjoint operator on a separable Hilbert space $H$. 
Prove that there exists a set of orthonormal vectors $\{u_n\}$ corresponding 
to nonzero eigenvalues $\{\lambda_n\}$ such that every element $x \in H$ can be written 
uniquely as $x = \sum_{n=1}^{\infty} \langle x, u_n\rangle u_n + v$, where $v \in N(T)$. Some books refer to 
this result as the Hilbert-Schmidt theorem. It is clearly equivalent to 
theorem 7.4.17. 

8. Let $T$ be a compact self-adjoint operator, let $\{\lambda_n\}$ be the nonzero eigenvalues 
of $T$, and let $u_n$ be the corresponding eigenvectors. For a fixed $g \in H$, 
consider the equation $Tf - \lambda f = g$. 

(a) Prove that if $\lambda = \lambda_n$, and $\lambda_n$ has multiplicity $m$, with eigenvectors 
$v_1, \dots, v_m$, then the equation has a solution if and only if $\langle g, v_i\rangle = 0$ for 
all $1 \le i \le m$ and, in this case, the solutions are of the form 

$$f = -\frac{g}{\lambda_n} + \sum_{i=1}^{m} a_iv_i + \sum\Big\{\frac{\lambda_k\langle g, u_k\rangle}{\lambda_n(\lambda_k - \lambda_n)}u_k : \lambda_k \ne \lambda_n\Big\}.$$

Here $a_1, \dots, a_m$ are arbitrary scalars. 

(b) Prove that if $\lambda = 0$ is not an eigenvalue, and $\sum_{n=1}^{\infty} \frac{|\langle g, u_n\rangle|^2}{|\lambda_n|^2} < \infty$, then 
the solution of the equation $Tf = g$ is 

$$f = \sum_{n=1}^{\infty} \frac{\langle g, u_n\rangle}{\lambda_n}u_n.$$

(c) What can you say about the case when $\lambda = 0$ is an eigenvalue of $T$? 

9. Let $T$ be a self-adjoint operator. Show that if $T^k$ is compact for some integer 
$k \ge 2$, then $T$ is compact. Hint: It is enough to show that $T^2$ is compact. 

10. Let $K(x, \xi) = \sin x\cos\xi$, $0 \le x, \xi \le \pi$. Show that $\lambda = 0$ is the only eigenvalue 
of the corresponding integral operator. 

11. Let $K(x, \xi)$ be a Hilbert-Schmidt kernel, and let $T$ be the corresponding 
integral operator. 
(a) Show that if $K(x, \xi) = \overline{K(\xi, x)}$, then $T$ is self-adjoint. 
(b) Show that $\|T\| \le \|K\|_2 = \Big\{\int_a^b\int_a^b |K(x, \xi)|^2\,dx\,d\xi\Big\}^{1/2}$. 

12. For a fixed 0<k <1, let K(x,§) = mar 0 <x,& <k, and let T be the 
corresponding integral operator. Show that ||T]| < . log") = : tan~'k. 


13. Let $K(x, \xi) = 1 + \sin x\sin\xi$, $0 \le x, \xi \le 2\pi$, and let $T$ be the corresponding 
integral operator. Find all the eigenvalues of $T$. Then find the solution of the 
integral equation $\int_0^{2\pi} (1 + \sin x\sin\xi)u(\xi)\,d\xi = \lambda u(x) + x$, where $\lambda$ is not an 
eigenvalue of $T$. Hint: If $\lambda \ne 0$ is an eigenvalue of $T$, then the corresponding 
eigenfunction must be of the form $u(x) = A + B\sin x$. 

14. Let $K(x, \xi) = \cos(x - \xi)$, $0 \le x, \xi \le 2\pi$, and let $T$ be the corresponding integral 
operator. Find all the eigenvalues of $T$, and describe the corresponding 
eigenspaces. 

15. Let $K(x, \xi)$ be a Hilbert-Schmidt kernel, and let $T$ be the corresponding 
integral operator. Show that $T^2u(x) = \int_a^b K_2(x, \xi)u(\xi)\,d\xi$, where $K_2(x, \xi) = 
\int_a^b K(x, t)K(t, \xi)\,dt$. In general, show that if $K_n(x, \xi) = \int_a^b K_{n-1}(x, t)K(t, \xi)\,dt$, 
then $T^nu(x) = \int_a^b K_n(x, \xi)u(\xi)\,d\xi$. 

16. Let $K(x, \xi)$ be a Hilbert-Schmidt kernel, and let $T$ be the corresponding 
integral operator. Show that if $|\lambda| > \|T\|$, then the function $F : \mathfrak{L}^2 \to \mathfrak{L}^2$ 
defined by $F(u) = \frac{1}{\lambda}[Tu - f]$ is a contraction on $\mathfrak{L}^2$. In this case, show that 
the solution of the equation $Tu - \lambda u = f$ is $u = -\sum_{n=0}^{\infty} \frac{T^nf}{\lambda^{n+1}}$. 

17. Let $K(x, \xi) = x\xi$, $0 \le x, \xi \le 1$. Show that $K_n(x, \xi) = x\xi/3^{n-1}$. Also show that 
if $|\lambda| > 1/3$, then the solution of the integral equation $Tu - \lambda u = f$ is 

$$u(x) = -\frac{f(x)}{\lambda} - \frac{3x}{\lambda(3\lambda - 1)}\int_0^1 \xi f(\xi)\,d\xi.$$


7.5 Compact Operators on Banach Spaces 


The reader may have observed that the definition of a compact operator makes 
perfectly good sense for an operator on a Banach space. We state the definition 
again. A linear operator on a Banach space $X$ is compact if it maps bounded 
subsets of $X$ into relatively compact subsets of $X$. All the results in theorems 7.4.1 
through 7.4.15 are valid for compact operators on Banach spaces. Moreover, the proofs 
we presented for these theorems are valid without alteration for 
compact operators on Banach spaces, with the exception of theorems 7.4.5, 7.4.6, 
and 7.4.15. The proofs of theorems 7.4.1 through 7.4.15 (with the exceptions noted 
above) were deliberately made more general than is needed for Hilbert spaces. For 
example, we used Riesz's lemma at several places when a simpler alternative was 
available. As an illustration, in the proof of lemma 7.4.10, we could simply choose a 
unit vector $u_n \in N_{L^{n+1}}$ such that $u_n \perp N_{L^n}$. Another place where the proof could be 
simplified is theorem 7.4.9, where we could have used the orthogonal complement 
$N_L^{\perp}$ instead of a complement $X_L$ of $N_L$. We now furnish the proofs of theorems 7.4.5, 
7.4.6, and 7.4.15 for compact operators on Banach spaces. 


Lemma 7.5.1. Let $(T_n)$ be a sequence of bounded operators on a Banach space $X$, and 
let $T$ be a bounded operator on $X$ such that $\lim_n T_n(x) = T(x)$ for every $x \in X$. 
Then, for every compact subset $K$ of $X$, $T_n$ converges uniformly to $T$ on $K$. 


Proof. By the Banach-Steinhaus theorem, $\sup_n \|T_n\| < \infty$. Choose a constant 
$M > 0$ such that $M \ge \|T\|$ and $M \ge \sup_n \|T_n\|$. Suppose, for a contradiction, 
that there exists a compact subset $K$ of $X$ on which $(T_n)$ does not converge 
uniformly to $T$. Then there exists a sequence $(x_n)$ in $K$, a subsequence $(S_n)$ 
of $(T_n)$, and a positive number $\epsilon$ such that $\|S_n(x_n) - T(x_n)\| \ge \epsilon$ for every 
$n \in \mathbb{N}$. By the compactness of $K$, $(x_n)$ contains a convergent subsequence $(y_n)$; 
we pass to the corresponding subsequence of $(S_n)$, which we still denote by $(S_n)$. 
Let $y = \lim_n y_n$. Now $\|S_n(y_n) - T(y_n)\| \le \|S_n(y) - T(y)\| + \|(S_n - T)(y_n - y)\| \le 
\|S_n(y) - T(y)\| + 2M\|y_n - y\| \to 0$. This contradicts $\|S_n(y_n) - T(y_n)\| \ge \epsilon$ and 
concludes the proof. 


The following is a partial generalization of theorem 7.4.5. 


Theorem 7.5.2. If a Banach space $X$ has a Schauder basis, then every compact 
operator $T$ on $X$ is the limit of a sequence of finite-rank operators. 


Proof. Let $\{u_n\}$ be a Schauder basis for $X$, and let $P_n$ be the canonical projection of 
$X$ onto $\mathrm{Span}\{u_1,\dots,u_n\}$ (see the definition before problem 25 on section 6.2). We 
prove that the sequence $T_n = P_nT$ of finite-rank operators converges in $\mathcal{L}(X)$ to $T$. 
For every $x \in X$, $\lim_n (P_n - I)(x) = 0$. By the previous lemma, the sequence $(P_n - I)$ 
converges uniformly to 0 on compact subsets of $X$. In particular, $P_n - I$ converges 
uniformly to 0 on $\overline{T(B)}$, where $B$ is the closed unit ball of $X$. Now $\|T_n - T\| = \sup_{x \in B} \|T_n(x) - T(x)\| = 
\sup_{x \in B} \|(P_n - I)(Tx)\| \to 0$. ∎ 


Theorem 7.5.3. A bounded operator T on a Banach space X is compact if and only 
if T* is compact. 


Proof. Suppose T is compact. The proof that T* is compact is a slight modification of the proof of theorem 7.4.6. Let B and B* denote the closed unit balls in X and X*, respectively. We need to show that T*(B*) is relatively compact in X*. Let $(\lambda_n)$ be a sequence of functionals in B*. For x, x' ∈ X, $|\lambda_n(x) - \lambda_n(x')| \le \|x - x'\|$. It follows that the sequence $(\lambda_n)$ is equicontinuous on X and, in particular, on $\overline{T(B)}$, which is compact by assumption. Ascoli's theorem guarantees a subsequence $(\lambda_{n_k})$ of $(\lambda_n)$ that converges uniformly on $\overline{T(B)}$. Now

$$\|T^*\lambda_{n_i} - T^*\lambda_{n_j}\| = \sup_{x\in B}|\langle x, T^*(\lambda_{n_i} - \lambda_{n_j})\rangle| = \sup_{x\in B}|\langle Tx, \lambda_{n_i} - \lambda_{n_j}\rangle| = \sup_{x\in B}|\langle Tx, \lambda_{n_i}\rangle - \langle Tx, \lambda_{n_j}\rangle| = \sup_{x\in B}|\lambda_{n_i}(Tx) - \lambda_{n_j}(Tx)|.$$

The uniform convergence of $(\lambda_{n_k})$ on T(B) guarantees that the last quantity can be made less than ε for sufficiently large integers i and j. Thus $(T^*\lambda_{n_k})$ is Cauchy and hence convergent.


We now prove that if T* is compact, then T is compact. If T* is compact, then, by the first part of the theorem, T** is compact. Let $\hat{B}$ be the image of B under the natural embedding of X into X**. By the compactness of T**, T**(B**) is relatively compact, and hence its subset $T^{**}(\hat{B})$ is also relatively compact. By theorem 6.6.3, $T^{**}(\hat{B}) = \widehat{T(B)}$, the image of T(B) under the natural embedding. Therefore $\widehat{T(B)}$ is relatively compact, and hence T(B) is relatively compact since it is isometric to $\widehat{T(B)}$. This proves the compactness of T. ∎


Using annihilators is not as simple as using orthogonal complements, for the simple reason that the annihilator of a subspace M of a Banach space X resides in a different space, X*. Thus the fact that $H = R_L \oplus N_{L^*}$ makes no sense if H is replaced with a Banach space X. However, we will generalize the fact that the dimensions of the spaces $N_L$, $N_{L^*}$, $X/R_L$, and $X^*/R_{L^*}$ are all finite and equal (theorem 7.4.15). We adopt the standing assumption that T is a compact operator on X, and use the notation that λ is a nonzero complex number, $L = T - \lambda I$, $L^* = T^* - \lambda I$, $N_L = \mathrm{Ker}(L)$, $R_L = \mathcal{R}(L)$, $N_{L^*} = \mathrm{Ker}(L^*)$, and $R_{L^*} = \mathcal{R}(L^*)$.


Recall that we have already established (theorem 7.4.8) that $N_L$ and $N_{L^*}$ are finite dimensional and that $R_L$ and $R_{L^*}$ are closed (theorem 7.4.9).


Lemma 7.5.4. $\dim(X/R_L) \le \dim(N_{L^*})$.


Proof. Let $x_1,\dots,x_n \in X$ be such that $\tilde{x}_i = x_i + R_L$ are linearly independent in $X/R_L$. Then, for each i, $x_i \notin R_L + \mathrm{Span}\{x_1,\dots,x_{i-1},x_{i+1},\dots,x_n\} = M_i$. Because the spaces $M_i$ are closed (see problem 18 in section 6.6), there exist bounded linear functionals $\lambda_1,\dots,\lambda_n \in X^*$ such that $\lambda_i(x_i) = 1$ and $\lambda_i(M_i) = 0$. Clearly, $\{\lambda_1,\dots,\lambda_n\}$ are independent in X* (reason: $\lambda_i(x_j) = \delta_{ij}$), and since $\lambda_i(R_L) = 0$, $\lambda_i \in R_L^{\perp} = N_{L^*}$. Thus $\dim(X/R_L) \le \dim(N_{L^*})$.


Theorem 7.5.5. $\dim(X/R_L) = \dim(N_{L^*})$.


Proof. Since $X/R_L$ is finite dimensional by the previous lemma, $\dim(X/R_L) = \dim(X/R_L)^*$. Applying theorem 6.6.7 with $M = R_L$, $(X/R_L)^*$ is isometrically isomorphic to $R_L^{\perp} = N_{L^*}$. Thus $\dim(X/R_L)^* = \dim(N_{L^*})$, and $\dim(X/R_L) = \dim(X/R_L)^* = \dim(N_{L^*})$.


Theorem 7.5.6. $\dim(X/R_L) = \dim(N_L)$.


Proof. Let $y_1,\dots,y_n$ be such that $\tilde{y}_i = y_i + R_L$ form a basis for $X/R_L$, let $x_1,\dots,x_m$ be a basis for $N_L$, and let $X_1$ be a closed complement of $N_L$. We will show that m = n. Suppose that m < n. Define a finite-rank operator $F \in \mathcal{L}(X)$ by $F|_{X_1} = 0$, $Fx_i = y_i$ for $1 \le i \le m$. The operator K = T + F is compact, and we claim that $K - \lambda I$ is one-to-one. If $(K - \lambda I)(x) = 0$, then $(T - \lambda I)(x) = -Fx \in R_L \cap \mathrm{Span}\{y_1,\dots,y_n\} = \{0\}$. Thus $(T - \lambda I)x = 0 = Fx$, and hence $x \in N_L$. The restriction of F to $N_L$ is clearly one-to-one; hence Fx = 0 implies that x = 0, and we have proved the claim. By the Fredholm alternative, $K - \lambda I$ is onto, which contradicts the fact that $y_{m+1}$ is not in the range of $K - \lambda I$. (Note that $\mathcal{R}(K - \lambda I) \subseteq R_L \oplus \mathrm{Span}\{y_1,\dots,y_m\}$.)


We have proved that $m \ge n$. If m > n, define a finite-rank operator F by $F|_{X_1} = 0$, $Fx_i = y_i$ for $1 \le i \le n$, and $Fx_i = y_n$ for $n < i \le m$. In this case, $K - \lambda I$ is onto (note that $\mathcal{R}(K - \lambda I) = R_L \oplus \mathrm{Span}\{y_1,\dots,y_n\} = X$) and hence one-to-one by the Fredholm alternative theorem. But this contradicts the fact that $(K - \lambda I)(x_n) = Fx_n = Fx_{n+1} = (K - \lambda I)(x_{n+1})$. Therefore m = n.


The following result follows immediately.


Theorem 7.5.7. The following numbers are finite and equal:


$\dim(N_L)$, $\dim(X/R_L)$, $\dim(N_{L^*})$, and $\dim(X^*/R_{L^*})$.
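
In finite dimensions every linear operator is compact, so the equality of the four dimensions can be checked by direct computation. The following Python sketch is an illustration only, not part of the text's argument: the helper names `rank` and `fredholm_dimensions` are hypothetical, the operator T is assumed to be given as a square matrix with rational entries, and the dual operator L* is represented by the transpose of the matrix of L.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix with rational entries, by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def fredholm_dimensions(T, lam):
    """Return (dim N_L, dim X/R_L, dim N_L*, dim X*/R_L*) for L = T - lam*I."""
    n = len(T)
    L = [[Fraction(T[i][j]) - (lam if i == j else 0) for j in range(n)]
         for i in range(n)]
    L_star = [list(col) for col in zip(*L)]  # the transpose represents L*
    # Row reduction of L leaves n - rank(L) free variables: the dimension of Ker(L).
    # The rows of L_star are the columns of L, so rank(L_star) = dim R(L).
    dim_N_L = n - rank(L)
    dim_coker_L = n - rank(L_star)
    dim_N_L_star = n - rank(L_star)
    dim_coker_L_star = n - rank(L)
    return dim_N_L, dim_coker_L, dim_N_L_star, dim_coker_L_star


T = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]
for lam in (2, 3, 5):
    print(lam, fredholm_dimensions(T, lam))
```

In this finite-dimensional setting the agreement of the four numbers reduces to the equality of row rank and column rank; the theorem above is the corresponding statement for compact operators on a Banach space.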


Exercises 


1. Find an example of an unbounded, finite-rank linear operator on a Banach 
space. 

2. Verify the details of the proof of theorem 7.5.6.

3. Let X = C[0, 1], and define (Tu)(x) = 1 u(t)dt. Prove that T is compact.

4. Let X = C[0, 1], and define (Tu)(x) = zh e“u(t)dt. Prove that T is compact.

5. Let X = C[−1, 1], and define (Tu)(x) = ae eas: dé. Prove that T is compact, and estimate ||T||.
6. Let X = C[−1, 1], and define $(Tu)(x) = x\int_{-1}^{1} \xi u(\xi)\,d\xi$. It is clear that T is compact. Show that λ = 0 and λ = 2/3 are the only eigenvalues of T and that if 0 ≠ λ ≠ 2/3, then $(T - \lambda I)^{-1}f = -\frac{1}{\lambda}\left(\frac{3x}{3\lambda - 2}\int_{-1}^{1} \xi f(\xi)\,d\xi + f\right)$. Is it true that X is the direct sum of the two eigenspaces?


8 
Integration Theory 


‘The only teaching that a professor can give, in my opinion, is that of thinking in
front of his students.’
Henri Lebesgue


Henri Lebesgue. 1875-1941 


Lebesgue entered the École Normale Supérieure in Paris in 1894 and was awarded
his teaching diploma in mathematics in 1897. He studied Baire’s papers on discontinuous
functions and realized that much more could be achieved in this area.
Building on the work of others, including that of Émile Borel and Camille Jordan,
Lebesgue formulated measure theory, which he published in 1901. He generalized 
the definition of the Riemann integral by extending the concept of the area (or 
measure), and his definition allowed the integrability of a much wider class of 
functions, including many discontinuous functions. This generalization of the 
Riemann integral revolutionized integral calculus. Up to the end of the nineteenth 
century, mathematical analysis was limited to continuous functions, based largely 
on the Riemann method of integration. 


Fundamentals of Mathematical Analysis. Adel N. Boules, Oxford University Press (2021). © Adel N. Boules.
DOI: 10.1093/oso/9780198868781.003.0008

Hawkins writes,¹


In Lebesgue’s work ... the generalized definition of the integral was simply 
the starting point of his contributions to integration theory. What made the 
new definition important was that Lebesgue was able to recognize in it an 
analytic tool capable of dealing with—and to a large extent overcoming— 
the numerous theoretical difficulties that had arisen in connection with Rie- 
mann’s theory of integration. In fact, the problems posed by these difficulties 
motivated all of Lebesgue’s major results. 


After he received his doctorate in 1902, Lebesgue held appointments in regional
colleges. In 1910 he was appointed to the Sorbonne, where he was promoted to
Professor of the Application of Geometry to Analysis in 1918. In 1921 he was
named as Professor of Mathematics at the Collège de France, a position he held
until his death in 1941. He also taught at the École Supérieure de Physique et de
Chimie Industrielles de la Ville de Paris between 1927 and 1937 and at the École
Normale Supérieure in Sèvres.


Lebesgue did not concentrate throughout his career on the field which he started. 
He also made major contributions in other areas of mathematics, including 
topology, potential theory, the Dirichlet problem, the calculus of variations, set 
theory, the theory of surface area, and dimension theory. 


8.1 The Riemann Integral 


In this section, we treat the definition and the fundamental properties of the 
Riemann integral of a bounded function on a compact box. The main reason for 
the inclusion of this section is that our definition of Lebesgue measure is, loosely 
stated, based on the notion that the Riemann integral of a continuous function f 
on a compact box measures the volume of the region below the graph of f. The 
presentation in this section is standard and reflects almost exactly the standard 
approach to the Riemann integral on a compact interval found in undergraduate 
real analysis textbooks. 


Let I = [a, b] be a compact interval. A grid in I is a sequence of points $x_0 = a < x_1 < x_2 < \dots < x_k = b$.

Each grid in I defines a partition of I into a finite set of closed intervals $P = \{[x_0, x_1], [x_1, x_2], \dots, [x_{k-1}, x_k]\}$. We make no distinction between a grid in I and the partition it generates. We also denote a partition of I by the sequence that defines it, $\{x_0,\dots,x_k\}$. We say that a partition $P' = \{y_0,\dots,y_m\}$ is a refinement of a partition $P = \{x_0,\dots,x_k\}$ if $\{x_0,\dots,x_k\} \subseteq \{y_0,\dots,y_m\}$. This simply means that P' is obtained from P by inserting additional grid points between some (or all) consecutive points $x_i$ and $x_{i+1}$. Note that if P' is a refinement of P, then every interval in P is the union of intervals in P'. If P and P' are partitions of [a, b], then P and P' have a common refinement, namely, the partition generated by the grid $\{x_0,\dots,x_k\} \cup \{y_0,\dots,y_m\}$.


¹ T. Hawkins, “Lebesgue, Henri Léon,” in C. C. Gillispie, F. L. Holmes, and N. Koertge (eds.), Complete Dictionary of Scientific Biography (Detroit: Charles Scribner’s Sons, 2008), 110-12.


Let $I_1,\dots,I_n$ be compact intervals. The closed box in $\mathbb{R}^n$ determined by $I_1,\dots,I_n$ is $Q = I_1 \times \dots \times I_n$. Thus if $I_i = [a_i, b_i]$, then $Q = \{x = (x_1,\dots,x_n) : a_i \le x_i \le b_i\}$. By definition, the volume of the box Q is $\mathrm{vol}(Q) = \prod_{i=1}^{n}(b_i - a_i)$. It is easy to show that $\mathrm{diam}(Q) = \left(\sum_{i=1}^{n}(b_i - a_i)^2\right)^{1/2}$. Now if, for each $1 \le i \le n$, $P_i$ is a partition of $I_i$, then the corresponding partition of Q is $\Delta = P_1 \times \dots \times P_n$. We often use the notation σ to denote a typical sub-box in Δ. Thus we use the following notation to denote the partition of Q generated by $P_1,\dots,P_n$:


$$\Delta = \{\sigma = J_1 \times \dots \times J_n : J_i \in P_i\}.$$


By a refinement Δ' of Δ, we mean a sequence of refinements $P_1',\dots,P_n'$ of $P_1,\dots,P_n$, respectively, and


$$\Delta' = \{\sigma' = J_1' \times \dots \times J_n' : J_i' \in P_i'\}.$$


Again, if Δ' is a refinement of Δ, then every sub-box σ in Δ is the union of sub-boxes $\{\sigma_1',\dots,\sigma_r'\}$ in Δ' such that $\mathrm{vol}(\sigma) = \sum_{j=1}^{r}\mathrm{vol}(\sigma_j')$.
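
The product construction of partitions and the additivity of volume under refinement can be tried out on a small example. The Python sketch below is only an illustration under stated assumptions: the function names (`intervals`, `sub_boxes`, `vol`, `contained`) and the sample grids are hypothetical, a box is represented as a tuple of `(a_i, b_i)` pairs, and exact rational arithmetic is used so that the equalities hold without rounding.

```python
from fractions import Fraction
from itertools import product
from math import prod


def intervals(grid):
    """Closed intervals [x_{j-1}, x_j] determined by an increasing grid."""
    return list(zip(grid, grid[1:]))


def sub_boxes(grids):
    """All sub-boxes J_1 x ... x J_n of the partition generated by one grid per axis."""
    return list(product(*(intervals(g) for g in grids)))


def vol(box):
    """Volume of a box given as a sequence of (a_i, b_i) pairs."""
    return prod(b - a for a, b in box)


def contained(small, big):
    """True if the box `small` lies inside the box `big`."""
    return all(A <= a and b <= B for (a, b), (A, B) in zip(small, big))


# A partition of Q = [0, 1] x [0, 5] and a refinement of it (sample data).
Q = [(0, 1), (0, 5)]
coarse = sub_boxes([[0, Fraction(1, 3), 1], [0, 2, 3, 5]])
fine = sub_boxes([[0, Fraction(1, 3), Fraction(1, 2), 1], [0, 1, 2, 3, 5]])

# The volumes of the sub-boxes add up to vol(Q) for both partitions.
print(sum(vol(s) for s in coarse), sum(vol(s) for s in fine), vol(Q))

# Each coarse sub-box is the union of the fine sub-boxes it contains,
# and its volume is the sum of their volumes.
for sigma in coarse:
    pieces = [tau for tau in fine if contained(tau, sigma)]
    assert vol(sigma) == sum(vol(tau) for tau in pieces)
```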


Now let f be a bounded real-valued function on Q, and let Δ be a partition of Q. Let the sub-boxes in Δ be enumerated as $\sigma_1,\dots,\sigma_K$. We use the notation


$$f^{\sigma_i} = \sup_{x\in\sigma_i} f(x), \quad\text{and}\quad f_{\sigma_i} = \inf_{x\in\sigma_i} f(x).$$


Both numbers are finite because f is assumed to be bounded. We define the upper and lower Riemann sums, respectively, of f corresponding to the partition Δ on Q by


$$S^{\Delta}(f) = \sum_{i=1}^{K} f^{\sigma_i}\,\mathrm{vol}(\sigma_i), \quad\text{and}\quad s_{\Delta}(f) = \sum_{i=1}^{K} f_{\sigma_i}\,\mathrm{vol}(\sigma_i).$$

Clearly, $s_{\Delta}(f) \le S^{\Delta}(f)$. Since f is bounded, there exist real numbers m and M such that, for every x ∈ Q, $m \le f(x) \le M$. For an arbitrary partition Δ of Q, $f^{\sigma_i} \ge m$, so $S^{\Delta}(f) \ge m\sum_{i=1}^{K}\mathrm{vol}(\sigma_i) = m\,\mathrm{vol}(Q)$. Thus the set $\{S^{\Delta}(f)\}$ of upper Riemann sums is bounded below, and hence the number $\beta = \inf_{\Delta} S^{\Delta}(f)$ is finite, where inf is taken over all partitions Δ of Q.

Similarly, $\alpha = \sup_{\Delta} s_{\Delta}(f)$ is a finite number. The numbers α and β are called, respectively, the lower and upper Riemann integrals of f over Q.


Definition. A bounded function f on a box Q is Riemann integrable over Q if α = β. In this case, we use the notation $\int_Q f(x)\,dx$ to denote the common value of α and β, and we call this number the Riemann integral of f over Q.


An important property of refinements is the following: if Δ' is a refinement of Δ, then


$$S^{\Delta'}(f) \le S^{\Delta}(f), \quad\text{and}\quad s_{\Delta'}(f) \ge s_{\Delta}(f).$$


The reason is as follows: Consider the contribution $f^{\sigma_i}\mathrm{vol}(\sigma_i)$ of one sub-box $\sigma_i$ to the upper Riemann sum of f corresponding to the partition Δ. Since $\sigma_i$ is the union of sub-boxes $\sigma_1',\dots,\sigma_r'$ in Δ', $f^{\sigma_j'} = \sup_{x\in\sigma_j'} f(x) \le \sup_{x\in\sigma_i} f(x) = f^{\sigma_i}$. Therefore the sum of the contributions of $\sigma_1',\dots,\sigma_r'$ to the upper Riemann sum corresponding to Δ' is $\sum_{j=1}^{r} f^{\sigma_j'}\mathrm{vol}(\sigma_j') \le f^{\sigma_i}\sum_{j=1}^{r}\mathrm{vol}(\sigma_j') = f^{\sigma_i}\mathrm{vol}(\sigma_i)$. This shows that $S^{\Delta'}(f) \le S^{\Delta}(f)$. The fact that $s_{\Delta'}(f) \ge s_{\Delta}(f)$ is justified using a similar estimate. The reader can now check that, for any two partitions $\Delta_1$ and $\Delta_2$ of Q,


$$s_{\Delta_1}(f) \le S^{\Delta_2}(f).$$

See problem 2 at the end of this section. Therefore, α ≤ β.


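Theorem 8.1.1. A bounded function f on a box Q is Riemann integrable if and only if, for every ε > 0, there exists a partition Δ of Q such that $S^{\Delta}(f) - s_{\Delta}(f) < \epsilon$.


Proof. Let ε > 0, and let Δ be such that $S^{\Delta}(f) - s_{\Delta}(f) < \epsilon$. Now $s_{\Delta}(f) \le \alpha \le \beta \le S^{\Delta}(f)$. Therefore β < α + ε. Since ε is arbitrary, β ≤ α, and hence α = β. Conversely, if α = β, and ε > 0, then there exist partitions $\Delta_1$ and $\Delta_2$ of Q such that $S^{\Delta_1}(f) - \alpha < \epsilon/2$, and $\alpha - s_{\Delta_2}(f) < \epsilon/2$. Let Δ be a common refinement of $\Delta_1$ and $\Delta_2$. Then $S^{\Delta}(f) \le S^{\Delta_1}(f)$, and $s_{\Delta}(f) \ge s_{\Delta_2}(f)$. Therefore, $S^{\Delta}(f) - s_{\Delta}(f) = S^{\Delta}(f) - \alpha + \alpha - s_{\Delta}(f) < \epsilon/2 + \epsilon/2 = \epsilon$.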


Example 1. If f : [a, b] → ℝ is integrable, then |f| is integrable.
Let Δ be a partition of [a, b], and let $\sigma_i$ be one of the subintervals in Δ. It is easy to see that, for $x, y \in \sigma_i$, $|f(x)| - |f(y)| \le f^{\sigma_i} - f_{\sigma_i}$. It follows that $|f|^{\sigma_i} - |f|_{\sigma_i} \le f^{\sigma_i} - f_{\sigma_i}$; hence $S^{\Delta}(|f|) - s_{\Delta}(|f|) \le S^{\Delta}(f) - s_{\Delta}(f)$. Since f is integrable, there is a partition Δ such that $S^{\Delta}(f) - s_{\Delta}(f) < \epsilon$. The result now follows from theorem 8.1.1. ♦

Example 2. Under the assumptions of example 1, $f^2$ is integrable.
Since $f^2 = |f|^2$, we may assume that f is a nonnegative function; hence


$$(f^2)^{\sigma_i} = (f^{\sigma_i})^2 \quad\text{and}\quad (f^2)_{\sigma_i} = (f_{\sigma_i})^2.$$

Now

$$(f^2)^{\sigma_i} - (f^2)_{\sigma_i} = (f^{\sigma_i} + f_{\sigma_i})(f^{\sigma_i} - f_{\sigma_i}) \le 2M(f^{\sigma_i} - f_{\sigma_i}),$$


where M is an upper bound of f on I. The result now follows from theorem 8.1.1. ♦


Example 3. If f and g are integrable on an interval [a, b], then so is fg.
By problem 1 at the end of this section, the functions f + g and f − g are integrable. By example 2, fg is integrable since $fg = \frac{1}{4}[(f + g)^2 - (f - g)^2]$. ♦


Example 4. The converse of the result in example 2 is false.
For example, the function


$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ -1 & \text{if } x \notin \mathbb{Q}, \end{cases}$$


is not integrable on [0, 1], but $f^2$ is. ♦


Now we consider a special sequence of partitions of Q that is very useful in proving results, especially when f is continuous. As before, $Q = I_1 \times \dots \times I_n$. For each k ∈ ℕ, and for $1 \le i \le n$, let $P_i^k$ be the partition of $I_i$ into $2^k$ subintervals of equal length $\frac{b_i - a_i}{2^k}$, and let $\Delta_k$ be the corresponding partition of Q. This is the construction described earlier except that each of the intervals $I_1,\dots,I_n$ is divided into the same number of congruent subintervals, which is a power of 2. It follows that each $\Delta_{k+1}$ is a refinement of $\Delta_k$. Thus $\Delta_k$ consists of $2^{nk}$ congruent sub-boxes, and each sub-box σ has dimensions $\frac{b_1 - a_1}{2^k},\dots,\frac{b_n - a_n}{2^k}$,


$$\mathrm{vol}(\sigma) = \frac{\mathrm{vol}(Q)}{2^{nk}}, \quad\text{and}\quad \mathrm{diam}(\sigma) = \frac{1}{2^k}\left(\sum_{i=1}^{n}(b_i - a_i)^2\right)^{1/2}.$$


Denote the sub-boxes in the partition $\Delta_k$ by $\sigma_1,\dots,\sigma_{2^{nk}}$. As before, we form the upper and lower Riemann sums of f corresponding to the partition $\Delta_k$ and write

$$S_k(f) = \sum_{i=1}^{2^{nk}} f^{\sigma_i}\,\mathrm{vol}(\sigma_i), \quad\text{and}\quad s_k(f) = \sum_{i=1}^{2^{nk}} f_{\sigma_i}\,\mathrm{vol}(\sigma_i).$$


Since $\Delta_{k+1}$ is a refinement of $\Delta_k$,


$$S_1(f) \ge S_2(f) \ge \dots, \quad\text{and}\quad s_1(f) \le s_2(f) \le \dots.$$

As we discussed, the above sequences are bounded; hence

$$\alpha_0 = \lim_k s_k(f) \quad\text{and}\quad \beta_0 = \lim_k S_k(f)$$


are finite and $\alpha_0 \le \beta_0$. Also $\alpha_0 \le \alpha \le \beta \le \beta_0$.
For the rest of this section, we assume that f is continuous on Q.


Theorem 8.1.2. If f is a continuous real-valued function on Q, then f is Riemann integrable, and


$$\int_Q f(x)\,dx = \lim_k s_k(f) = \lim_k S_k(f).$$


Proof. In the notation of the previous paragraph, we prove that $\alpha_0 = \beta_0$. This will establish all the assertions of the theorem. Let ε > 0, and let k be a positive integer such that $|S_k(f) - \beta_0| < \epsilon/3$, and $|s_k(f) - \alpha_0| < \epsilon/3$. Since f is uniformly continuous on Q, there exists δ > 0 such that $|f(x) - f(y)| < \frac{\epsilon}{3\,\mathrm{vol}(Q)}$ whenever $\|x - y\| < \delta$. We may assume, without loss of generality, that the integer k is such that the diameter of each sub-box in $\Delta_k$ is less than δ. Since f assumes its maximum and minimum values on $\sigma_i$ in $\sigma_i$, $|f^{\sigma_i} - f_{\sigma_i}| < \frac{\epsilon}{3\,\mathrm{vol}(Q)}$, for each $1 \le i \le 2^{nk}$. Now

$$|S_k(f) - s_k(f)| \le \sum_{i=1}^{2^{nk}} |f^{\sigma_i} - f_{\sigma_i}|\,\mathrm{vol}(\sigma_i) < \frac{\epsilon}{3\,\mathrm{vol}(Q)}\sum_{i=1}^{2^{nk}}\mathrm{vol}(\sigma_i) = \epsilon/3.$$

Finally,

$$|\alpha_0 - \beta_0| \le |\alpha_0 - s_k(f)| + |s_k(f) - S_k(f)| + |S_k(f) - \beta_0| < \epsilon/3 + \epsilon/3 + \epsilon/3 = \epsilon.$$


Since ε is arbitrary, $\alpha_0 = \beta_0$.
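
The behaviour of the dyadic sums can be observed numerically. The following Python sketch is an illustration under a simplifying assumption that is not part of the theorem: the function f is taken to be nondecreasing in each coordinate, so that on every sub-box the infimum is attained at the lower corner and the supremum at the upper corner. The name `dyadic_sums` and the sample function f(x, y) = xy on the unit square are hypothetical choices, and exact rational arithmetic is used.

```python
from fractions import Fraction
from itertools import product


def dyadic_sums(f, box, k):
    """Return (s_k, S_k) for f on box = [(a_1, b_1), ..., (a_n, b_n)].

    Assumes f is nondecreasing in each coordinate, so that the infimum and
    supremum over a sub-box are the values at its lower and upper corners.
    """
    n = len(box)
    steps = [Fraction(b - a) / 2**k for a, b in box]
    vol = Fraction(1)
    for h in steps:
        vol *= h  # every sub-box of the k-th dyadic partition has this volume
    lower = upper = Fraction(0)
    for idx in product(range(2**k), repeat=n):
        lo = [Fraction(a) + j * h for (a, _), j, h in zip(box, idx, steps)]
        hi = [c + h for c, h in zip(lo, steps)]
        lower += f(*lo) * vol
        upper += f(*hi) * vol
    return lower, upper


def f(x, y):
    return x * y  # nondecreasing in x and in y on the unit square


Q = [(0, 1), (0, 1)]
for k in range(1, 6):
    s_k, S_k = dyadic_sums(f, Q, k)
    print(k, s_k, S_k, S_k - s_k)
```

For this particular f the lower sums increase and the upper sums decrease toward the common value 1/4, and the gap between them is exactly $2^{-k}$, in line with the statement of the theorem.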


Theorem 8.1.3. If f and g are continuous on Q, then


$$\int_Q (f + g)\,dx = \int_Q f\,dx + \int_Q g\,dx.$$


Proof. $S_k(f + g) = \sum_{i=1}^{2^{nk}} (f + g)^{\sigma_i}\,\mathrm{vol}(\sigma_i)$. Now $(f + g)^{\sigma_i} = \max_{x\in\sigma_i}[f(x) + g(x)] \le \max_{x\in\sigma_i} f(x) + \max_{x\in\sigma_i} g(x) = f^{\sigma_i} + g^{\sigma_i}$. Therefore, $S_k(f + g) \le S_k(f) + S_k(g)$. Taking the limit of both sides as k → ∞, $\int_Q (f + g)\,dx \le \int_Q f\,dx + \int_Q g\,dx$. Similarly, $s_k(f + g) \ge s_k(f) + s_k(g)$; hence $\int_Q (f + g)\,dx \ge \int_Q f\,dx + \int_Q g\,dx$.


Example 5. Let f : ℝⁿ → ℂ be continuous. If $\int_Q |f(x)|\,dx = 0$ for every cube Q, then f = 0.

Suppose, contrary to our assertion, there is a point $x_0 \in \mathbb{R}^n$ such that $m = |f(x_0)| > 0$. By the continuity of f, there is a cube Q centered at $x_0$ such that, for x ∈ Q, $|f(x) - f(x_0)| < m/2$. Now, for x ∈ Q, $m - |f(x)| = |f(x_0)| - |f(x)| \le |f(x_0) - f(x)| < m/2$. Thus $|f(x)| > m/2$ for all x ∈ Q. Consequently, $\int_Q |f(x)|\,dx \ge \frac{m}{2}\mathrm{vol}(Q) > 0$. This contradiction proves the result. ♦
Lemma 8.1.4. Let f be continuous on Q. Then $\int_Q (-f)\,dx = -\int_Q f\,dx$.


Proof. Since $(-f)^{\sigma_i} = \max_{x\in\sigma_i}(-f(x)) = -\min_{x\in\sigma_i} f(x) = -f_{\sigma_i}$,


$$\int_Q (-f)\,dx = \lim_k S_k(-f) = \lim_k \sum_{i=1}^{2^{nk}} (-f)^{\sigma_i}\,\mathrm{vol}(\sigma_i) = -\lim_k \sum_{i=1}^{2^{nk}} f_{\sigma_i}\,\mathrm{vol}(\sigma_i) = -\lim_k s_k(f) = -\int_Q f\,dx.$$


Theorem 8.1.5. If f is a continuous real-valued function on Q, and a ∈ ℝ, then


$$\int_Q (af)\,dx = a\int_Q f\,dx.$$


Proof. If a ≥ 0, the proof is simple. If a < 0, then


$$\int_Q (af)\,dx = \int_Q |a|(-f)\,dx = |a|\int_Q (-f)\,dx = -|a|\int_Q f\,dx = a\int_Q f\,dx.$$


It is now easy to verify the linearity of the integral: if f and g are continuous on Q, and a, b ∈ ℝ, then $\int_Q (af + bg)\,dx = a\int_Q f\,dx + b\int_Q g\,dx$.


Theorem 8.1.6. Let f and g be continuous real-valued functions on Q. Then


(a) if f ≥ 0, then $\int_Q f\,dx \ge 0$; and
(b) if f ≤ g on Q, then $\int_Q f\,dx \le \int_Q g\,dx$.


Proof. Part (a) follows from the definition.
To prove (b), let h = g − f. Then h ≥ 0; hence, by (a), $\int_Q h\,dx \ge 0$, so $\int_Q g\,dx - \int_Q f\,dx = \int_Q (g - f)\,dx = \int_Q h\,dx \ge 0$.


Definition. Let f be a continuous, complex-valued function on Q, and write $f = f_1 + if_2$, where $f_1$ and $f_2$ are continuous real-valued functions. Define


$$\int_Q f\,dx = \int_Q f_1\,dx + i\int_Q f_2\,dx.$$


Theorem 8.1.7. For continuous complex-valued functions f and g, and all a, b ∈ ℂ,


$$\int_Q (af + bg)\,dx = a\int_Q f\,dx + b\int_Q g\,dx.$$


The proof is purely computational and is left as an exercise. ∎


Theorems 8.1.6(a) and 8.1.7 are often summarized by the terminology that the Riemann integral is a positive linear functional on the space C(Q) of continuous complex-valued functions on Q. The positivity of the integral means that, for f ≥ 0, $\int_Q f\,dx \ge 0$.


Exercises 


In all the exercises below, we assume that f is a bounded function on a box Q.


1. Prove that the sum (difference) of two integrable functions f and g is integrable, and $\int_Q (f \pm g)\,dx = \int_Q f\,dx \pm \int_Q g\,dx$.

2. Prove that, for any two partitions $\Delta_1$ and $\Delta_2$ of Q, $s_{\Delta_1}(f) \le S^{\Delta_2}(f)$. Hint: Consider a common refinement, Δ, of $\Delta_1$ and $\Delta_2$.

3. Let f and g be integrable on [a, b], and let f ≤ h ≤ g. Give an example to show that h need not be integrable.

4. Suppose f is integrable, and f(x) ≥ m > 0 for some constant m and all x ∈ [a, b]. Prove that 1/f is integrable on [a, b].

5. Let $a = x_0 < x_1 < \dots < x_n = b$ be a partition of the interval [a, b]. For $1 \le k \le n - 1$, let $E_k = [x_{k-1}, x_k)$, and let $E_n = [x_{n-1}, x_n]$. For constants $a_1,\dots,a_n$, define $s = \sum_{k=1}^{n} a_k\chi_{E_k}$. Such a function is called a step function. Prove that $\int_a^b s(x)\,dx = \sum_{k=1}^{n} a_k(x_k - x_{k-1})$.

6. Let f : [a, b] → [0, ∞) be an integrable function. Prove that $\int_a^b f(x)\,dx = \sup\{\int_a^b s(x)\,dx\}$, where the supremum is taken over all step functions s such that s ≤ f.

In all of the remaining exercises, we assume that f is continuous on Q.

7. (a) Prove theorem 8.1.7.
   (b) Prove that $|\int_Q f\,dx| \le \int_Q |f|\,dx$. Prove the statement for complex-valued functions f.

8. Let f ≥ 0 be such that $\int_Q f\,dx = 0$. Prove that f = 0.

9. Suppose that the sequence $f_k$ converges to f in C(Q), that is, in the uniform norm. Prove that $\lim_k \int_Q f_k\,dx = \int_Q f\,dx$.

10. Define the average value of the function f on Q by $f_Q = \frac{1}{\mathrm{vol}(Q)}\int_Q f\,dx$. Prove that there exists a point ξ ∈ Q such that $f(\xi) = f_Q$.

11. Let $Q = I_1 \times \dots \times I_n$, let $J_i$ be a closed subinterval of $I_i$ for $1 \le i \le n$, and let $Q_1 = J_1 \times \dots \times J_n$. Prove that if f ≥ 0 on Q, then $\int_{Q_1} f\,dx \le \int_Q f\,dx$.

12. Let $I_1,\dots,I_n$ be compact intervals, and let c be an interior point in $I_1 = [a, b]$. Suppose $Q_1 = [a, c] \times I_2 \times \dots \times I_n$ and that $Q_2 = [c, b] \times I_2 \times \dots \times I_n$. Show that if f is continuous on $Q_1 \cup Q_2$, then $\int_{Q_1\cup Q_2} f\,dx = \int_{Q_1} f\,dx + \int_{Q_2} f\,dx$.

13. Fubini’s theorem. Let $\{I_i = [a_i, b_i], 1 \le i \le n\}$ be compact intervals, let $Q = I_1 \times \dots \times I_n$, and let $Q' = I_2 \times \dots \times I_n$. For a point $x = (x_1, x_2,\dots,x_n) \in Q$, we write $x = (x_1, x')$, where $x' = (x_2,\dots,x_n) \in Q'$. Prove that $\int_Q f\,dx = \int_{a_1}^{b_1}\int_{Q'} f(x_1, x')\,dx'\,dx_1$. It follows that $\int_Q f\,dx$ can be computed by evaluating the iterated integral $\int_{a_1}^{b_1}\cdots\int_{a_n}^{b_n} f\,dx_n \cdots dx_1$.

14. The fundamental theorem of calculus. Let f ∈ C[a, b].

   (a) For x ∈ [a, b], define $F(x) = \int_a^x f(t)\,dt$. Show that F is differentiable and that F'(x) = f(x).

   (b) Show that if F is differentiable in an open interval containing [a, b] such that F' = f on [a, b], then $\int_a^b f(x)\,dx = F(b) - F(a)$.


8.2 Measure Spaces 


Let us consider the problem of measuring the volume of objects (sets) in ℝ³. Strictly speaking, volume is a function that assigns a nonnegative number to a subset of ℝ³. A natural question is whether it is possible to measure the volume of an arbitrary subset of ℝ³. For the most natural measure on ℝ³, namely, the Lebesgue measure, the answer to the question is no. In other words, there are subsets of ℝ³ to which a volume cannot be assigned. The question then becomes that of finding a large enough collection of subsets of ℝ³ to which a volume can be assigned. Such sets are called measurable. It is clearly desirable for the finite union of measurable sets to be measurable. It was a paradigm shift when it was realized that a successful formulation of a measure theory necessitates that we allow the countable union of measurable sets to be measurable, and this leads to the definition of a σ-algebra. The definition of a measure as a set function on a σ-algebra is quite intuitive. This section develops the basics of abstract measure theory and measurable functions. The picture continues to evolve and culminates in section 8.4 with the construction of the Lebesgue measure.


For the remainder of this chapter, we use the notation E' for the complement X − E of a subset E of a set X.


Definitions. A collection $\mathfrak{M}$ of subsets of a nonempty set X is said to be an algebra of sets in X if the following two conditions are met:


(a) if $E \in \mathfrak{M}$, then $E' \in \mathfrak{M}$; and
(b) if $E_1, E_2 \in \mathfrak{M}$, then $E_1 \cup E_2 \in \mathfrak{M}$.


An algebra $\mathfrak{M}$ is called a σ-algebra if it satisfies the additional condition


(c) if $(E_n)$ is a sequence in $\mathfrak{M}$, then $\bigcup_{n=1}^{\infty} E_n \in \mathfrak{M}$.

Example 1. For an arbitrary set X, the power set $\mathcal{P}(X)$ is a σ-algebra in X. ♦


Example 2. Let X be an uncountable set. A subset E of X is called co-countable if E' is countable. The collection of countable and co-countable subsets of X is a σ-algebra.


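Theorem 8.2.1.

(a) If $\mathfrak{M}$ is an algebra, then $\emptyset, X \in \mathfrak{M}$.

(b) If $\mathfrak{M}$ is an algebra and $E_1, E_2 \in \mathfrak{M}$, then $E_1 \cap E_2 \in \mathfrak{M}$, and $E_1 - E_2 \in \mathfrak{M}$. It follows by induction that an algebra is closed under the formation of finite unions and intersections.

(c) If $\mathfrak{M}$ is a σ-algebra, and $E_n \in \mathfrak{M}$, then $\bigcap_{n=1}^{\infty} E_n \in \mathfrak{M}$.


Proof.
(a) Let $E \in \mathfrak{M}$. Then $E' \in \mathfrak{M}$; hence $X = E \cup E' \in \mathfrak{M}$, and $\emptyset = X' \in \mathfrak{M}$.
(b) Using De Morgan's laws, if $E_1, E_2 \in \mathfrak{M}$, then $E_1 \cap E_2 = (E_1' \cup E_2')' \in \mathfrak{M}$. Also $E_1 - E_2 = E_1 \cap E_2' \in \mathfrak{M}$.
(c) This follows from De Morgan's law, since $\bigcap_{n=1}^{\infty} E_n = \left(\bigcup_{n=1}^{\infty} E_n'\right)'$.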


Theorem 8.2.2. Let $\mathcal{C}$ be an arbitrary collection of subsets of a set X. Then there exists a (unique) smallest σ-algebra $\mathfrak{M}$ that contains $\mathcal{C}$.


Proof. It is clear that the intersection of a family of σ-algebras is a σ-algebra. The collection of σ-algebras on X containing $\mathcal{C}$ is not empty since $\mathcal{P}(X)$ is such a σ-algebra. Now take $\mathfrak{M}$ to be the intersection of all the σ-algebras in X that contain $\mathcal{C}$.


Definition. The smallest σ-algebra that contains a collection of sets $\mathcal{C}$ is called the σ-algebra generated by $\mathcal{C}$.
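
On a finite set every algebra is automatically a σ-algebra, so the generated σ-algebra can be computed by repeatedly closing a collection under complements and pairwise unions. The Python sketch below is only an illustration for finite X (the intersection argument in the proof above is what works in general); the function name `generated_sigma_algebra` and the sample data are hypothetical.

```python
def generated_sigma_algebra(X, collection):
    """Smallest family of subsets of the finite set X that contains `collection`
    and is closed under complements and (finite) unions."""
    X = frozenset(X)
    sets = {frozenset(), X} | {frozenset(c) for c in collection}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for E in current:
            if X - E not in sets:      # close under complements
                sets.add(X - E)
                changed = True
        for E in current:
            for F in current:
                if E | F not in sets:  # close under pairwise unions
                    sets.add(E | F)
                    changed = True
    return sets


M = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
for E in sorted(M, key=lambda s: (len(s), sorted(s))):
    print(sorted(E))
```

In this example the generated σ-algebra consists of all unions of the three "atoms" {1}, {2}, and {3, 4}; the set {3} is not a member, because no sequence of complements and unions separates 3 from 4.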


Definition. Let X be a metric (topological) space. The smallest σ-algebra in X containing the collection of open subsets of X is called the Borel algebra in X, and its members are called the Borel subsets of X. The collection of Borel sets of X is denoted by $\mathcal{B}(X)$. In particular, the σ-algebras $\mathcal{B}(\mathbb{R}^n)$ are of central importance.


Example 3. The collection $\mathcal{C} = \{(a, b) : a, b \in \mathbb{R}, a < b\}$ generates $\mathcal{B}(\mathbb{R})$.
Since every member of $\mathcal{C}$ is an open set, the σ-algebra generated by $\mathcal{C}$ is contained in $\mathcal{B}(\mathbb{R})$. Now $\mathcal{C}$ generates $\mathcal{B}(\mathbb{R})$ because every open subset of ℝ is a countable union of members of $\mathcal{C}$. ♦

Definition. Let X be a metric (topological) space. The intersection of a countable collection of open subsets of X is known as a $G_\delta$ set. The countable union of closed subsets of X is called an $F_\sigma$ set.


It follows from theorem 8.2.1 that $\mathcal{B}(X)$ contains all open sets, closed sets, $F_\sigma$ sets, and $G_\delta$ sets.


Definitions. Let $\mathfrak{M}$ be a σ-algebra of subsets in X. A positive measure on $\mathfrak{M}$ is a set function $\mu : \mathfrak{M} \to [0, \infty]$ such that


(a) $\mu \not\equiv \infty$, in the sense that μ(E) < ∞ for at least one $E \in \mathfrak{M}$; and
(b) if $\{E_n\}$ is a countable collection of mutually disjoint members of $\mathfrak{M}$, then


$$\mu\left(\bigcup_{n=1}^{\infty} E_n\right) = \sum_{n=1}^{\infty} \mu(E_n).$$


The pair $(X, \mathfrak{M})$ is called a measurable space, the members of $\mathfrak{M}$ are called measurable sets, and $(X, \mathfrak{M}, \mu)$ is called a measure space. If $\mathfrak{M}$ and μ are understood, we loosely say that X is a measure space.


Property (b) is known as the countable additivity of positive measures.
If μ(X) < ∞, we say that μ is a finite positive measure.


Example 4 (the counting measure). Let X be a nonempty set, and let $\mathfrak{M} = \mathcal{P}(X)$. Define $\mu : \mathfrak{M} \to \overline{\mathbb{R}}$ as follows: μ(E) = Card(E) if E is finite, and μ(E) = ∞ otherwise. Then μ is a measure on $\mathcal{P}(X)$.

Example 5 (the Dirac measure). Let X be a nonempty set, and let $\mathfrak{M} = \mathcal{P}(X)$. Fix an element $x_0 \in X$, and define $\mu : \mathfrak{M} \to \mathbb{R}$ as follows: μ(E) = 1 if $x_0 \in E$, and μ(E) = 0 otherwise. Then μ is a measure on $\mathcal{P}(X)$.

Example 6. Let X = ℕ. A subset E of X is at most countable, so we can write $E = \{n_1, n_2, \dots\}$. Define $\mu(E) = \sum_i 2^{-n_i}$. It is easy to see that μ is a measure on $\mathcal{P}(X)$. Observe that μ(X) = 1.

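Theorem 8.2.3. If X is a measure space, then


(a) The monotonicity of positive measures: if $E, F \in \mathfrak{M}$ and $E \subseteq F$, then μ(E) ≤ μ(F).

(b) The countable subadditivity of positive measures: if $(E_n)$ is a sequence in $\mathfrak{M}$, then $\mu\left(\bigcup_{n=1}^{\infty} E_n\right) \le \sum_{n=1}^{\infty} \mu(E_n)$.

(c) If $E_1 \subseteq E_2 \subseteq \dots$ is an ascending sequence of subsets in $\mathfrak{M}$, then $\mu\left(\bigcup_{n=1}^{\infty} E_n\right) = \lim_n \mu(E_n)$.

(d) If $E_1 \supseteq E_2 \supseteq \dots$ is a descending sequence of subsets in $\mathfrak{M}$ and $\mu(E_1) < \infty$, then $\mu\left(\bigcap_{n=1}^{\infty} E_n\right) = \lim_n \mu(E_n)$.

Proof. (a) Since F = E ∪ (F − E), μ(F) = μ(E) + μ(F − E) ≥ μ(E).
(b) Let $B_1 = E_1$, and, for n ≥ 2, let $B_n = E_n - \bigcup_{i=1}^{n-1} E_i$. The sequence $\{B_n\}$ is pairwise disjoint, and $\bigcup_{n=1}^{\infty} E_n = \bigcup_{n=1}^{\infty} B_n$; hence $\mu\left(\bigcup_{n=1}^{\infty} E_n\right) = \mu\left(\bigcup_{n=1}^{\infty} B_n\right) = \sum_{n=1}^{\infty} \mu(B_n) \le \sum_{n=1}^{\infty} \mu(E_n)$.
(c) Let $B_1 = E_1$, and, for n ≥ 2, let $B_n = E_n - E_{n-1}$. The sequence $\{B_n\}$ is pairwise disjoint, and $\bigcup_{i=1}^{n} B_i = E_n$. Now $\mu\left(\bigcup_{n=1}^{\infty} E_n\right) = \mu\left(\bigcup_{n=1}^{\infty} B_n\right) = \sum_{n=1}^{\infty} \mu(B_n) = \lim_n \sum_{i=1}^{n} \mu(B_i) = \lim_n \mu\left(\bigcup_{i=1}^{n} B_i\right) = \lim_n \mu(E_n)$.
(d) The sequence $E_1 - E_n$ is ascending, and $E_1 - \bigcap_{n=1}^{\infty} E_n = \bigcup_{n=1}^{\infty} (E_1 - E_n)$. By part (c), $\mu(E_1) - \mu\left(\bigcap_{n=1}^{\infty} E_n\right) = \mu\left(\bigcup_{n=1}^{\infty} (E_1 - E_n)\right) = \lim_n \mu(E_1 - E_n) = \mu(E_1) - \lim_n \mu(E_n)$. Hence $\mu\left(\bigcap_{n=1}^{\infty} E_n\right) = \lim_n \mu(E_n)$. ∎

Example 7. The condition $\mu(E_1) < \infty$ in part (d) of the previous theorem cannot be omitted. For example, if μ is the counting measure on ℕ, and $E_n = [n, \infty) \cap \mathbb{N}$, then $\lim_n \mu(E_n) = \infty$, while $\mu\left(\bigcap_{n=1}^{\infty} E_n\right) = \mu(\emptyset) = 0$.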
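
Parts (a) and (c) of theorem 8.2.3 can be observed concretely for the measure of example 6. The Python sketch below is an illustration restricted to finite subsets of ℕ = {1, 2, ...} (an assumption made so that every sum is finite and exact); the function name `mu` is a hypothetical stand-in for that measure.

```python
from fractions import Fraction


def mu(E):
    """Measure of a finite subset E of N = {1, 2, ...} with weights 2^(-n)."""
    return sum((Fraction(1, 2**n) for n in set(E)), Fraction(0))


A, B = {1, 3}, {2, 5}           # disjoint sets: the measures add
print(mu(A | B) == mu(A) + mu(B))

# The sets E_N = {1, ..., N} ascend to N, and mu(E_N) = 1 - 2^(-N)
# increases to 1 = mu(N), as part (c) of the theorem predicts.
for N in range(1, 8):
    print(N, mu(range(1, N + 1)))
```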


Outer Measures 


We now discuss an important general construction which we need in section 8.4 
for the construction of the Lebesgue measure on R”. 


Definition. Let X be a nonempty set. A set function $m^* : \mathcal{P}(X) \to [0, \infty]$ is called an outer measure on X if the following conditions are satisfied:


(a) if $E \subseteq F \subseteq X$, then $m^*(E) \le m^*(F)$; and
(b) for a countable sequence $(E_n)$ of subsets of X,


$$m^*\left(\bigcup_{n=1}^{\infty} E_n\right) \le \sum_{n=1}^{\infty} m^*(E_n).$$


Thus an outer measure is a nonnegative set function on $\mathcal{P}(X)$ that is monotone and countably subadditive. Outer measures have little intrinsic importance. However, an outer measure can be restricted to a positive measure on a certain σ-algebra of sets in X, as we detail below.


Definition. Let m* be an outer measure on X. A subset E of X is said to be m*-measurable (or simply measurable, in this discussion) if


$$m^*(A) = m^*(A \cap E) + m^*(A \cap E')$$


for all subsets A of X.
The above condition is known as the Carathéodory condition. Let $\mathfrak{M}$ denote the set of all m*-measurable subsets of X.


The Carathéodory condition is not a very intuitive idea. However, it immediately guarantees the finite additivity of m* on $\mathfrak{M}$. Indeed, if $E_1$ and $E_2$ are disjoint subsets of X, and $E_1$ is measurable, then applying the Carathéodory condition with $A = E_1 \cup E_2$, we obtain


$$m^*(E_1 \cup E_2) = m^*((E_1 \cup E_2) \cap E_1) + m^*((E_1 \cup E_2) \cap E_1') = m^*(E_1) + m^*(E_2).$$

The Carathéodory condition also implies without too much difficulty that $\mathfrak{M}$ is an algebra (see lemma 8.2.4). In fact, it turns out that $\mathfrak{M}$ is a σ-algebra and that the restriction of m* to $\mathfrak{M}$ is a positive measure. We prove this in three steps.


Lemma 8.2.4. $\mathfrak{M}$ is an algebra.


Proof. If $E \in \mathfrak{M}$, then $E' \in \mathfrak{M}$. This follows from the symmetry of the definition of a measurable set. Now let $E_1, E_2 \in \mathfrak{M}$. We need to prove that $E_1 \cup E_2$ is measurable. Because m* is subadditive, it is sufficient to show that


$$m^*(A \cap (E_1 \cup E_2)) + m^*(A \cap (E_1 \cup E_2)') \le m^*(A)$$

for all subsets A of X.


Using the identity $A \cap (E_1 \cup E_2) = (A \cap E_1) \cup (A \cap E_1' \cap E_2)$ and the measurability of $E_1$ and $E_2$,

$$m^*((A \cap E_1) \cup (A \cap E_1' \cap E_2)) + m^*(A \cap E_1' \cap E_2') \le m^*(A \cap E_1) + m^*(A \cap E_1' \cap E_2) + m^*(A \cap E_1' \cap E_2') = m^*(A \cap E_1) + m^*(A \cap E_1') = m^*(A).$$ ∎
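Lemma 8.2.5. If $(E_n)$ is a disjoint sequence of measurable sets and $A \subseteq X$, then


(a) $m^*\left(A \cap \bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} m^*(A \cap E_i)$,
(b) $m^*\left(A \cap \bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} m^*(A \cap E_i)$, and
(c) $m^*\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} m^*(E_i)$.


Proof. Using the fact that $E_1$ is measurable, we have


$$m^*(A \cap (E_1 \cup E_2)) = m^*(A \cap (E_1 \cup E_2) \cap E_1) + m^*(A \cap (E_1 \cup E_2) \cap E_1') = m^*(A \cap E_1) + m^*(A \cap E_2).$$


To complete the proof of part (a), we use induction coupled with the fact we just established (n = 2) and the fact that $\mathfrak{M}$ is an algebra.


To prove (b),


$$\sum_{i=1}^{n} m^*(A \cap E_i) = m^*\left(A \cap \bigcup_{i=1}^{n} E_i\right) \le m^*\left(A \cap \bigcup_{i=1}^{\infty} E_i\right) = m^*\left(\bigcup_{i=1}^{\infty} (A \cap E_i)\right) \le \sum_{i=1}^{\infty} m^*(A \cap E_i).$$


Taking the limit as n → ∞, we obtain (b). Part (c) follows from (b) by taking $A = \bigcup_{i=1}^{\infty} E_i$. ∎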


Theorem 8.2.6 (Carathéodory’s theorem). $\mathfrak{M}$ is a σ-algebra, and the restriction of m* to $\mathfrak{M}$ is a positive measure.


Proof. The fact that m* is countably additive on $\mathfrak{M}$ is part (c) of lemma 8.2.5. We need to show that $\mathfrak{M}$ is closed under the formation of countable unions. Let $E_n \in \mathfrak{M}$, and write $E = \bigcup_{n=1}^{\infty} E_n$. Define $B_1 = E_1$, and, for n ≥ 2, $B_n = E_n - \bigcup_{i=1}^{n-1} E_i$. Since $\mathfrak{M}$ is an algebra, each $B_n \in \mathfrak{M}$. Notice that the sets $B_n$ are mutually disjoint, and $\bigcup_{n=1}^{\infty} B_n = \bigcup_{n=1}^{\infty} E_n$. Therefore, without loss of generality, we may assume the sets $E_n$ are mutually disjoint. We need to show that, for $A \subseteq X$, $m^*(A) \ge m^*(A \cap E) + m^*(A \cap E')$. Using the facts that $\bigcup_{i=1}^{n} E_i \in \mathfrak{M}$, $A \cap \left(\bigcup_{i=1}^{n} E_i\right)' \supseteq A \cap E'$, and lemma 8.2.5, we obtain

$$m^*(A) = m^*\left(A \cap \bigcup_{i=1}^{n} E_i\right) + m^*\left(A \cap \left(\bigcup_{i=1}^{n} E_i\right)'\right) \ge m^*\left(A \cap \bigcup_{i=1}^{n} E_i\right) + m^*(A \cap E') = \sum_{i=1}^{n} m^*(A \cap E_i) + m^*(A \cap E').$$


Taking the limit as n → ∞ in the above string, then applying part (b) of lemma 8.2.5, we obtain


$$m^*(A) \ge \sum_{i=1}^{\infty} m^*(A \cap E_i) + m^*(A \cap E') = m^*(A \cap E) + m^*(A \cap E').$$
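
On a small finite set the Carathéodory condition can be tested by brute force, which gives a feel for how restrictive it is. The Python sketch below is an illustration only: the names `subsets`, `measurable_sets`, and `m_star` are hypothetical, and the sample outer measure on X = {1, 2, 3} (value 0 on the empty set, 2 on X, and 1 on every other subset) is an example chosen here, not one taken from the text.

```python
from itertools import combinations


def subsets(X):
    """All subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def measurable_sets(X, m_star):
    """Subsets E of X with m*(A) = m*(A & E) + m*(A - E) for every A in P(X)."""
    all_sets = subsets(frozenset(X))
    return {E for E in all_sets
            if all(m_star(A) == m_star(A & E) + m_star(A - E) for A in all_sets)}


X = {1, 2, 3}


def m_star(E):
    """Sample outer measure: monotone and subadditive, but not additive."""
    if not E:
        return 0
    return 2 if len(E) == 3 else 1


print(sorted(sorted(E) for E in measurable_sets(X, m_star)))
print(len(measurable_sets(X, len)))  # counting measure used as an outer measure
```

For the sample outer measure only the empty set and X pass the test (for instance, E = {1} fails with A = {1, 2}), whereas for the counting measure every subset is measurable. In both cases the measurable sets form a σ-algebra on which m* is additive, as the theorem asserts.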


Definition. Let $(X, \mathfrak{M}, \mu)$ be a measure space. We say that μ is a complete measure if whenever $E \in \mathfrak{M}$ is such that μ(E) = 0, then any subset of E is in $\mathfrak{M}$. Thus $\mathfrak{M}$ contains all subsets of sets of measure 0.


We have now reached the culmination of this construction. 


Theorem 8.2.7. Let m* be an outer measure on a set X, and let $\mathfrak{M}$ be the σ-algebra of measurable subsets of X. Then the restriction of m* to $\mathfrak{M}$ is a complete measure.


Proof. We have already established the fact that m* is a measure on $\mathfrak{M}$. It remains to show the completeness of m*. We first show that if $Z \subseteq X$ and m*(Z) = 0, then $Z \in \mathfrak{M}$. Let $A \subseteq X$. Then $0 \le m^*(A \cap Z) \le m^*(Z) = 0$. Thus $m^*(A) \le m^*(A \cap Z) + m^*(A \cap Z') = m^*(A \cap Z') \le m^*(A)$. This proves that Z is measurable. Now if $E \subseteq Z$, then $0 \le m^*(E) \le m^*(Z) = 0$; hence m*(E) = 0. By what we have already established, $E \in \mathfrak{M}$. ∎


A word about complete measures is very much in order here. It is an inconvenient 
fact that incomplete measures can occur quite naturally. For example, the product 
of Lebesgue measures, which are complete, is not a complete measure (see section 
8.8). It is desirable to know whether an incomplete measure space can be completed.
The answer is yes, and the completion of a measure turns out to be a rather
simple construction. See problems 3 and 4 at the end of this section. 


Measurable Functions 


For the remainder of this section, $(X, \mathfrak{M})$ is a measurable space. We allow real-valued functions on $X$ to take infinite values. This is essential because, for example, the limit of a sequence of functions $f_n(x)$ may well diverge to $\pm\infty$, or it may not even exist for some $x \in X$. It will turn out that this is largely a technicality because, in practice, the exceptional set of points where a reasonable measurable function $f$ takes infinite values has measure 0 (see, e.g., example 1 in section 8.3). In this section, we have to contend with the nuisance that functions can assume infinite values.


Definition. An extended real-valued function $f : X \to \overline{\mathbb{R}}$ is said to be measurable if $f^{-1}((a, \infty])$ is measurable for every $a \in \mathbb{R}$.


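Proposition 8.2.8. For a function $f : X \to \overline{\mathbb{R}}$, the following are equivalent, where conditions (b) through (d) are required to hold for every $a \in \mathbb{R}$:


(a) $f$ is measurable.

(b) $f^{-1}([a, \infty]) \in \mathfrak{M}$.
(c) $f^{-1}([-\infty, a)) \in \mathfrak{M}$.
(d) $f^{-1}([-\infty, a]) \in \mathfrak{M}$.


Proof. (a) implies (b): $f^{-1}([a, \infty]) = \bigcap_{n=1}^{\infty} f^{-1}((a - 1/n, \infty])$.
(b) implies (c): $f^{-1}([-\infty, a)) = X - f^{-1}([a, \infty])$.
(c) implies (d): $f^{-1}([-\infty, a]) = \bigcap_{n=1}^{\infty} f^{-1}([-\infty, a + 1/n))$.
(d) implies (a): $f^{-1}((a, \infty]) = X - f^{-1}([-\infty, a])$. ∎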
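
On a finite set the definition of measurability can be checked mechanically. The sketch below is an illustration under stated assumptions: the set `X`, the partition generating `M`, and the functions `f` and `g` are all hypothetical choices made for the example. It tests whether $f^{-1}((a, \infty]) \in \mathfrak{M}$ for every $a$; on a finite set only finitely many thresholds produce distinct preimages.

```python
from itertools import combinations


def sigma_algebra_from_partition(blocks):
    """All unions of blocks of a finite partition; this is a sigma-algebra."""
    blocks = [frozenset(b) for b in blocks]
    members = set()
    for r in range(len(blocks) + 1):
        for combo in combinations(blocks, r):
            members.add(frozenset().union(*combo))
    return members


def is_measurable(f, X, M):
    """Check that {x : f(x) > a} lies in M for every real a.

    For finite X, the preimage only changes when a crosses a value of f,
    so it suffices to test a at each value of f and at one point below
    the smallest value.
    """
    values = {f(x) for x in X}
    for a in values | {min(values) - 1}:
        preimage = frozenset(x for x in X if f(x) > a)
        if preimage not in M:
            return False
    return True


X = {0, 1, 2, 3}
M = sigma_algebra_from_partition([{0, 1}, {2, 3}])


def f(x):
    return 1.0 if x < 2 else 5.0   # constant on each block


def g(x):
    return float(x)                # separates points inside a block


print(is_measurable(f, X, M), is_measurable(g, X, M))
```

Here `f` is measurable because it is constant on the blocks that generate `M`, while `g` is not, since $\{g > 0\} = \{1, 2, 3\}$ is not a union of blocks.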


Theorem 8.2.9. (a) A constant function is measurable.
(b) If $A \subseteq X$, then $\chi_A$ is measurable if and only if $A$ is measurable.
(c) If $f : X \to \overline{\mathbb{R}}$ is measurable and $c \in \mathbb{R}$, then $f + c$ and $cf$ are measurable.


The proof is left as an exercise. 


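Lemma 8.2.10. Let $f$ and $g$ be measurable, extended real-valued functions. Then the following subsets of $X$ are measurable:


$A = \{x \in X : f(x) > g(x)\}$.
$B = \{x \in X : f(x) \ge g(x)\}$.
$C = \{x \in X : f(x) = g(x)\}$.


Proof. Since $f(x) > g(x)$ if and only if there is a rational number $r$ such that $f(x) > r > g(x)$, the set

$$A = \bigcup_{r \in \mathbb{Q}} \big[\{x \in X : f(x) > r\} \cap \{x \in X : g(x) < r\}\big]$$

is measurable because $\mathbb{Q}$ is countable.
The set $B$ is measurable because it is the complement of the set

$$\{x \in X : g(x) > f(x)\},$$

which is measurable by the first part of the proof (with the roles of $f$ and $g$ interchanged).
Finally,

$$C = \{x \in X : f(x) \ge g(x)\} \cap \{x \in X : g(x) \ge f(x)\}$$

is measurable by the second part of the proof. ∎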

Proposition 8.2.11. An extended real-valued function $f$ is measurable if and only if the following two conditions hold:


(a) $f^{-1}(-\infty)$ and $f^{-1}(\infty)$ are measurable subsets of $X$, and
(b) $f^{-1}(V)$ is measurable for every open subset $V$ of $\mathbb{R}$.


Proof. Suppose $f$ is measurable. Because $f^{-1}(\infty) = \bigcap_{n=1}^{\infty} f^{-1}((n, \infty])$, and $f^{-1}(-\infty) = \bigcap_{n=1}^{\infty} f^{-1}([-\infty, -n))$, $f^{-1}(\infty)$ and $f^{-1}(-\infty)$ are measurable. Since an open subset of $\mathbb{R}$ is a countable union of open bounded intervals, it is enough to show that the inverse image under $f$ of an open bounded interval is in $\mathfrak{M}$. But, for a bounded interval $(a, b)$, $f^{-1}((a, b)) = f^{-1}((a, \infty]) \cap f^{-1}([-\infty, b))$, which is in $\mathfrak{M}$.

Conversely, since $(a, \infty)$ is open and $f^{-1}((a, \infty]) = f^{-1}((a, \infty)) \cup f^{-1}(\infty)$, $f^{-1}((a, \infty])$ is in $\mathfrak{M}$. ∎


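Lemma 8.2.12. Let $f : X \to \overline{\mathbb{R}}$ be a measurable function, and let $\varphi : \mathbb{R} \to \mathbb{R}$ be continuous. Then the function $h : X \to \mathbb{R}$ defined below is measurable:

$$h(x) = \begin{cases} \varphi(f(x)) & \text{if } f(x) \in \mathbb{R}, \\ 0 & \text{if } |f(x)| = \infty. \end{cases}$$


Proof. We use proposition 8.2.11. By construction, $h$ takes only finite values, so $h^{-1}(\infty) = \emptyset = h^{-1}(-\infty)$. By the continuity of $\varphi$, if $V$ is an open subset of $\mathbb{R}$, then $U = \varphi^{-1}(V)$ is open in $\mathbb{R}$. By proposition 8.2.11, $f^{-1}(U)$ is measurable. Now

$$h^{-1}(V) = \begin{cases} f^{-1}(U) & \text{if } 0 \notin V, \\ f^{-1}(U) \cup f^{-1}(\infty) \cup f^{-1}(-\infty) & \text{if } 0 \in V. \end{cases}$$

In either case, $h^{-1}(V)$ is measurable. Again by proposition 8.2.11, $h$ is a measurable function. ∎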


This lemma can be applied to infer the measurability of a wide class of functions. 
The following is a sample. 


Theorem 8.2.13. If $f : X \to \overline{\mathbb{R}}$ is measurable, then so are $\max\{f, 0\}$, $\min\{f, 0\}$, $|f|^p$ for all positive $p$, and $f^m$ for $m \in \mathbb{N}$.


Proof. This follows from lemma 8.2.12 applied with $\varphi(t) = \max\{t, 0\}$, $\varphi(t) = \min\{t, 0\}$, $\varphi(t) = |t|^p$, and $\varphi(t) = t^m$, respectively. Here it is assumed, in accordance with the lemma, that when $|f(x)| = \infty$, $(\varphi \circ f)(x)$ is defined to be 0. ∎

Lemma 8.2.14. Let $A$ be a measurable subset of $X$, and let $f : A \to \overline{\mathbb{R}}$ be such that $\{x \in A : f(x) > a\} \in \mathfrak{M}$ for every $a \in \mathbb{R}$. Define $h : X \to \overline{\mathbb{R}}$ as follows: $h|_A = f$ and $h(X - A) = 0$. Then $h$ is measurable.


Proof. Let $a \in \mathbb{R}$. If $a \ge 0$, then

$$h^{-1}((a, \infty]) = \{x \in A : h(x) > a\} \cup \{x \in X - A : h(x) > a\} = \{x \in A : f(x) > a\},$$

which is measurable.
If $a < 0$,

$$h^{-1}((a, \infty]) = \{x \in A : f(x) > a\} \cup (X - A),$$

which is also measurable. ∎


A function satisfying the conditions of lemma 8.2.14 is said to be measurable on A. 
Loosely speaking, lemma 8.2.14 says that a measurable function on a measurable 
subset of X can be extended to a measurable function on X. Another way to look 
at it is that altering the values of a measurable function on a measurable subset 
produces a measurable function. Assigning the value 0 to h(X — A) is arbitrary, 
and any (extended) real number can be used instead of 0. 


Theorem 8.2.15. Let $f$ and $g$ be measurable, extended real-valued functions on a measurable space $X$, and let

$$A = \{x \in X : f(x) \in \mathbb{R}\} \cap \{x \in X : g(x) \in \mathbb{R}\}.$$

Then the following functions are measurable:

$$h(x) = \begin{cases} f(x) + g(x) & \text{if } x \in A, \\ 0 & \text{if } x \notin A, \end{cases} \qquad k(x) = \begin{cases} f(x) g(x) & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases}$$


Proof. By proposition 8.2.11, the set $A$ is measurable. By lemma 8.2.14, it is enough to check that $h$ and $k$ are measurable on $A$. Now if $a \in \mathbb{R}$, $\{x \in A : f(x) + g(x) > a\} = A \cap \{x \in X : f(x) > a - g(x)\}$, which is measurable by lemma 8.2.10. Thus $h$ is measurable on $A$. Now that $f + g$ and $f - g$ are measurable on $A$, $(f + g)^2$ and $(f - g)^2$ are measurable on $A$ by theorem 8.2.13. It follows that $k$ is measurable on $A$ because $fg = [(f + g)^2 - (f - g)^2]/4$. ∎
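
For readers who want the last identity spelled out, the following short computation (valid at every point of $A$, where both $f$ and $g$ are finite) verifies it by expanding the two squares.

```latex
\begin{align*}
(f+g)^2 - (f-g)^2
  &= \bigl(f^2 + 2fg + g^2\bigr) - \bigl(f^2 - 2fg + g^2\bigr) \\
  &= 4fg,
\qquad\text{so}\qquad
fg = \tfrac14\bigl[(f+g)^2 - (f-g)^2\bigr].
\end{align*}
```

The point of the identity is that it expresses a product using only sums, differences, squares, and a scalar multiple, each of which has already been shown to preserve measurability.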

Theorem 8.2.16. Let $f_n$ be a sequence of measurable functions. Then the following functions are measurable:


(a) $\sup_n f_n$,
(b) $\inf_n f_n$,
(c) $\limsup_n f_n$, and
(d) $\liminf_n f_n$.


Also, the set $\{x \in X : \lim_n f_n(x) \text{ exists}\}$ is measurable.

Proof. Parts (a) and (b) are true because

$$\{x \in X : \sup_n f_n(x) > a\} = \bigcup_{n=1}^{\infty} \{x \in X : f_n(x) > a\},$$

and

$$\{x \in X : \inf_n f_n(x) < a\} = \bigcup_{n=1}^{\infty} \{x \in X : f_n(x) < a\},$$

respectively. Now parts (c) and (d) follow from parts (a) and (b) because

$$\limsup_n f_n = \inf_n \{\sup_{i \ge n} f_i\}, \quad\text{and}\quad \liminf_n f_n = \sup_n \{\inf_{i \ge n} f_i\}.$$

The last assertion follows from parts (c) and (d) and from lemma 8.2.10, because the set in question is

$$\{x \in X : \limsup_n f_n(x) = \liminf_n f_n(x)\}.$$ ∎


Definition. A complex function $f : X \to \mathbb{C}$ is said to be measurable if its real and imaginary parts are measurable.


Theorem 8.2.17. A complex function $f : X \to \mathbb{C}$ is measurable if and only if $f^{-1}(V) \in \mathfrak{M}$ for every open subset $V$ of the complex plane.


Proof. Write $f = f_1 + i f_2$ and suppose $f$ is measurable. An open set $V$ in $\mathbb{C}$ is a countable union of open bounded rectangles, so it is enough to show that, for the rectangle $R = (a, b) \times (c, d)$, $f^{-1}(R)$ is measurable. But this is obvious since $f^{-1}(R) = f_1^{-1}((a, b)) \cap f_2^{-1}((c, d))$. To prove the converse, let $a \in \mathbb{R}$, and consider the open set $V = \{\xi + i\eta \in \mathbb{C} : \xi > a\}$. By assumption, $f^{-1}(V)$ is in $\mathfrak{M}$. But $f^{-1}(V) = f_1^{-1}((a, \infty))$. One shows that $f_2$ is measurable by considering the open set $V = \{\xi + i\eta \in \mathbb{C} : \eta > a\}$. ∎

Excursion: The Hopf Extension Theorem²


The motivation for the Hopf extension included below is not entirely precise, but we hope it will help the reader gain some insight into the construction of important measures such as the Lebesgue measure on $\mathbb{R}^2$. The plane contains a collection of subsets for which a natural measure exists, namely, the collection of rectangles.³ The measure (area) of a rectangle ought to be the product of its dimensions. The collection $\mathfrak{C}$ of finite disjoint unions of rectangles in the plane is known to be an algebra in $\mathbb{R}^2$, and the measure of a member of $\mathfrak{C}$ is defined in the obvious way: it is the (finite) sum of the measures of the rectangles in the union. The immediate question is whether the natural measure we just described can be extended to the $\sigma$-algebra $\mathfrak{M}$ generated by $\mathfrak{C}$.

The Hopf extension abstracts the above motivation and provides an affirmative 
answer (theorem 8.2.19). Theorem 8.2.20 gives a sufficient condition for the 
uniqueness of such an extension. 


We will construct measures on product spaces (section 8.8) using a different 
approach, and this excursion can be bypassed without affecting the continuity of 
the rest of this chapter. 


We will adopt the following standing assumptions throughout this excursion: 


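1. $\mathfrak{C}$ is an algebra of subsets in $X$. (Thus $X \in \mathfrak{C}$, and $\emptyset \in \mathfrak{C}$.)

2. $\mu : \mathfrak{C} \to [0, \infty]$ is a set function such that $\mu(\emptyset) = 0$.

3. $\mu$ is countably additive on $\mathfrak{C}$ in the sense that if $\{C_n\}$ is a disjoint sequence in $\mathfrak{C}$, and $\bigcup_{n=1}^{\infty} C_n \in \mathfrak{C}$, then $\mu(\bigcup_{n=1}^{\infty} C_n) = \sum_{n=1}^{\infty} \mu(C_n)$. Observe that such a function is monotone.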


Lemma 8.2.18. Under the standing assumptions, define a set function $n^* : \mathcal{P}(X) \to [0, \infty]$ by

$$n^*(E) = \inf\Big\{\sum_{n=1}^{\infty} \mu(C_n) : C_n \in \mathfrak{C},\ E \subseteq \bigcup_{n=1}^{\infty} C_n\Big\}.$$

(a) $n^*$ is an outer measure on $X$.
(b) For every $E \in \mathfrak{C}$, $n^*(E) = \mu(E)$.
(c) Every $E \in \mathfrak{C}$ is $n^*$-measurable.


² This theorem is also attributed to Lebesgue.
³ Intuitively, a rectangle is the product of two intervals. More precisely, a rectangle is the product of two Lebesgue measurable subsets of $\mathbb{R}$.
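
Proof. The definition of $n^*$ is meaningful because $X \in \mathfrak{C}$, and the monotonicity of $n^*$ is obvious. To prove that $n^*$ is subadditive, let $\{E_n\}$ be a countable collection of subsets of $X$, and, without loss of generality, assume that $\sum_{n=1}^{\infty} n^*(E_n) < \infty$. Let $\epsilon > 0$ and, for each $n \in \mathbb{N}$, let $\{C_{n,j}\} \subseteq \mathfrak{C}$ be such that $E_n \subseteq \bigcup_{j=1}^{\infty} C_{n,j}$ and $\sum_{j=1}^{\infty} \mu(C_{n,j}) < n^*(E_n) + \epsilon/2^n$. Now $\bigcup_{n=1}^{\infty} E_n \subseteq \bigcup_{n,j=1}^{\infty} C_{n,j}$; hence

$$n^*\Big(\bigcup_{n=1}^{\infty} E_n\Big) \le \sum_{n=1}^{\infty} \sum_{j=1}^{\infty} \mu(C_{n,j}) \le \sum_{n=1}^{\infty} \big[n^*(E_n) + \epsilon/2^n\big] = \sum_{n=1}^{\infty} n^*(E_n) + \epsilon.$$

Since $\epsilon$ is arbitrary, the proof of part (a) is complete.


(b) If $E \in \mathfrak{C}$, then $E \subseteq \bigcup_{n=1}^{\infty} C_n$, where $C_1 = E$, and $C_n = \emptyset$ for $n \ge 2$. Thus $n^*(E) \le \mu(E)$. Suppose $E \subseteq \bigcup_{n=1}^{\infty} C_n$, where each $C_n \in \mathfrak{C}$, and define $D_1 = C_1$ and, for $n \ge 2$, $D_n = C_n - \bigcup_{i=1}^{n-1} C_i$. Because $\mathfrak{C}$ is an algebra, each $D_n \in \mathfrak{C}$. Clearly, $\bigcup_{n=1}^{\infty} D_n = \bigcup_{n=1}^{\infty} C_n$; hence $E = \bigcup_{n=1}^{\infty} (E \cap D_n)$. By the additivity of $\mu$ on $\mathfrak{C}$, $\mu(E) = \mu(\bigcup_{n=1}^{\infty} (E \cap D_n)) = \sum_{n=1}^{\infty} \mu(E \cap D_n) \le \sum_{n=1}^{\infty} \mu(C_n)$. By the very definition of $n^*$, $\mu(E) \le n^*(E)$.


(c) Let $E \in \mathfrak{C}$, $A \subseteq X$, and, without loss of generality, assume that $n^*(A) < \infty$. For every $\epsilon > 0$, there exists a sequence $\{C_n\}$ in $\mathfrak{C}$ such that $A \subseteq \bigcup_{n=1}^{\infty} C_n$ and $\sum_{n=1}^{\infty} \mu(C_n) < n^*(A) + \epsilon$. By the additivity of $\mu$ on $\mathfrak{C}$, $n^*(A) + \epsilon > \sum_{n=1}^{\infty} \mu(C_n) = \sum_{n=1}^{\infty} \mu(C_n \cap E) + \sum_{n=1}^{\infty} \mu(C_n \cap E') \ge n^*(A \cap E) + n^*(A \cap E')$. Since $\epsilon$ is arbitrary, the result follows. ∎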


Theorem 8.2.19 (the Hopf extension theorem). Under the standing assumptions, the set function $\mu$ has an extension to a positive measure on the $\sigma$-algebra $\mathfrak{M}$ generated by $\mathfrak{C}$.


Proof. By theorem 8.2.6 (Carathéodory's theorem), the collection $\mathfrak{N}$ of $n^*$-measurable subsets of $X$ is a $\sigma$-algebra, and the restriction, $\nu$, of $n^*$ to $\mathfrak{N}$ is a positive measure. Since every member of $\mathfrak{C}$ is $n^*$-measurable, $\mathfrak{M} \subseteq \mathfrak{N}$. The measure space we seek is $(X, \mathfrak{M}, \nu)$. ∎


The next theorem establishes a sufficient condition for the uniqueness of the Hopf extension.


Theorem 8.2.20. Suppose, in addition to the standing assumptions, that the following assumption is satisfied:


$\sigma$-finiteness: there exists a sequence $(X_n)$ in $\mathfrak{C}$ such that $X = \bigcup_{n=1}^{\infty} X_n$, and, for every $n \in \mathbb{N}$, $\mu(X_n) < \infty$.


Then the extension $\nu$ provided by the previous theorem is unique.


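Proof. The following two facts are consequences of the $\sigma$-finiteness assumption.

(a) The sequence $(X_n)$ may be assumed to be mutually disjoint, because we can replace it with the sequence $Y_1 = X_1$ and, for $n \ge 2$, $Y_n = X_n - \bigcup_{i=1}^{n-1} X_i$. Clearly, $Y_n \in \mathfrak{C}$, $\mu(Y_n) \le \mu(X_n) < \infty$, and $\bigcup_{n=1}^{\infty} Y_n = X$.


(b) An arbitrary member $E \in \mathfrak{M}$ can be written as $E = \bigcup_{n=1}^{\infty} E_n$, where $(E_n)$ is a disjoint sequence in $\mathfrak{M}$ such that $\nu(E_n) < \infty$. We simply set $E_n = E \cap Y_n$. Then $\nu(E_n) \le \nu(Y_n) = \mu(Y_n) < \infty$.


We now prove the result. Suppose there is another measure that extends $\mu$ from $\mathfrak{C}$ to $\mathfrak{M}$. We continue to use the symbol $\mu$ to denote this extension. Thus we assume that $\mu(C) = \nu(C)$ for every $C \in \mathfrak{C}$ and prove that $\mu(E) = \nu(E)$ for every $E \in \mathfrak{M}$. Observe the following facts:


(c) If $\{C_n\} \subseteq \mathfrak{C}$, and $C = \bigcup_{n=1}^{\infty} C_n$, then $\nu(C) = \lim_n \nu(\bigcup_{i=1}^{n} C_i) = \lim_n \mu(\bigcup_{i=1}^{n} C_i) = \mu(C)$.


(d) For every $E \in \mathfrak{M}$, $\mu(E) \le \nu(E)$. If $E \subseteq \bigcup_{n=1}^{\infty} C_n$, where $\{C_n\} \subseteq \mathfrak{C}$, then $\mu(E) \le \mu(\bigcup_{n=1}^{\infty} C_n) \le \sum_{n=1}^{\infty} \mu(C_n)$. By the definition of $\nu$, $\mu(E) \le \nu(E)$.


(e) If $E \in \mathfrak{M}$ and $\nu(E) < \infty$, then $\mu(E) = \nu(E)$. Let $\epsilon > 0$. There exists a sequence $(C_n)$ in $\mathfrak{C}$ such that $E \subseteq C = \bigcup_{n=1}^{\infty} C_n$ and $\sum_{n=1}^{\infty} \mu(C_n) < \nu(E) + \epsilon$. Now $\nu(C) \le \sum_{n=1}^{\infty} \nu(C_n) = \sum_{n=1}^{\infty} \mu(C_n) < \nu(E) + \epsilon$. In particular, $\nu(C - E) < \epsilon$. Using fact (c), we have $\nu(E) \le \nu(C) = \mu(C) = \mu(E) + \mu(C - E) \le \mu(E) + \nu(C - E) < \mu(E) + \epsilon$. Since $\epsilon$ is arbitrary, $\nu(E) \le \mu(E)$. Now $\mu(E) = \nu(E)$ by fact (d).


Finally, for an arbitrary set $E \in \mathfrak{M}$, we use fact (b) to write $E = \bigcup_{n=1}^{\infty} E_n$, where $(E_n)$ is a disjoint sequence in $\mathfrak{M}$ such that $\nu(E_n) < \infty$. Using fact (e), $\nu(E) = \sum_{n=1}^{\infty} \nu(E_n) = \sum_{n=1}^{\infty} \mu(E_n) = \mu(E)$. ∎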


Exercises 


1. Let $\mathcal{A} = \{A_1, \dots, A_n\}$ be distinct subsets of a nonempty set $X$. Show that the $\sigma$-algebra generated by $\mathcal{A}$ contains at most $2^{2^n}$ members.

2. Show that if a $\sigma$-algebra $\mathfrak{M}$ is infinite, then it is uncountable.

3. Completion of an incomplete measure. Let $(X, \mathfrak{M}, \mu)$ be an incomplete measure space, and let $\mathfrak{Z}$ be the collection of subsets of sets of $\mu$-measure 0. Let $\overline{\mathfrak{M}}$ be the smallest $\sigma$-algebra in $X$ that contains $\mathfrak{M} \cup \mathfrak{Z}$. Prove that every member of $\overline{\mathfrak{M}}$ has the form $E \cup Z$, where $E \in \mathfrak{M}$ and $Z \in \mathfrak{Z}$. Extend the definition of $\mu$ to $\overline{\mathfrak{M}}$ as follows: for $E \in \mathfrak{M}$ and $Z \in \mathfrak{Z}$, $\overline{\mu}(E \cup Z) = \mu(E)$. Show that $\overline{\mu}$ is well defined and that $\overline{\mu}$ is a complete measure. Hint: Show that the set $\mathfrak{M}_1 = \{E \cup Z : E \in \mathfrak{M}, Z \in \mathfrak{Z}\}$ is a $\sigma$-algebra.


4. This exercise provides a useful alternative characterization of the completion of a measure space. In the notation of the previous exercise, prove that, for a subset $E$ of $X$, $E \in \overline{\mathfrak{M}}$ if and only if there exist two sets $A$ and $B$ in $\mathfrak{M}$ such that $A \subseteq E \subseteq B$ and $\mu(B - A) = 0$.

5. Prove that each of the following collections of sets generates $\mathcal{B}(\mathbb{R})$:

(a) $\{(a, \infty) : a \in \mathbb{R}\}$
(b) $\{(-\infty, b) : b \in \mathbb{R}\}$

6. Prove that the collection of open boxes $\{\prod_{i=1}^{n} (a_i, b_i) : a_i, b_i \in \mathbb{Q}\}$ generates $\mathcal{B}(\mathbb{R}^n)$.

7. Suppose $\mathfrak{M}$ is a $\sigma$-algebra generated by a collection $\mathfrak{C}$ of subsets of a nonempty set $X$. Prove that $\mathfrak{M}$ is the union of the $\sigma$-algebras generated by $\mathfrak{F}$, where $\mathfrak{F}$ ranges over all the countable subsets of $\mathfrak{C}$. Hint: The latter union is a $\sigma$-algebra.

8. Prove that if $E$ and $F$ are measurable sets such that $\mu(E \,\Delta\, F) = 0$, then $\mu(E) = \mu(F) = \mu(E \cup F) = \mu(E \cap F)$.

9. Let $E_n$ be a sequence of measurable sets such that $\sum_{n=1}^{\infty} \mu(E_n) < \infty$. Prove that the set $\bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} E_i$ has measure 0. Conclude that, except for a set of measure 0, every $x \in X$ belongs to at most finitely many of the sets $E_n$.

10. Let $E_1, \dots, E_n$ be measurable sets and, for $1 \le j \le n$, let $F_j$ be the set of points in $X$ that belong to exactly $j$ of the sets $E_1, \dots, E_n$. Prove that $\mu(\bigcup_{i=1}^{n} E_i) = \sum_{j=1}^{n} \mu(F_j)$, and $\sum_{i=1}^{n} \mu(E_i) = \sum_{j=1}^{n} j\mu(F_j)$. Hint: $F_j = \{x \in X : \sum_{i=1}^{n} \chi_{E_i}(x) = j\}$.

11. Prove theorem 8.2.9.

12. Show that if $f$ is measurable and $a \in \mathbb{R}$, then $f^{-1}(a)$ is measurable.

13. Let $(X, \mathfrak{M})$ be a measurable space such that $\mathfrak{M} \ne \mathcal{P}(X)$. Prove that there is a function $f$ such that $|f|$ is measurable but $f$ is not.

14. Suppose that $(X, \mathfrak{M})$ is a measurable space and that $Y$ is a nonempty set. Show that if $f : X \to Y$, then the collection $\mathfrak{N} = \{E \subseteq Y : f^{-1}(E) \in \mathfrak{M}\}$ is a $\sigma$-algebra.

15. Let $(X, \mathfrak{M})$ be a measurable space, and let $f : X \to \mathbb{R}$ be a measurable function. Show that $f^{-1}(B)$ is measurable for every Borel subset $B$ of $\mathbb{R}$. Hint: The collection $\Omega = \{E \subseteq \mathbb{R} : f^{-1}(E) \in \mathfrak{M}\}$ contains all open subsets of $\mathbb{R}$.

16. Let $X$ be a topological space, and let $f : X \to \mathbb{R}$ be a continuous function. Show that $f^{-1}(B)$ is a Borel subset of $X$ for every Borel subset $B$ of $\mathbb{R}$.

17. Show that if $E \in \mathcal{B}(\mathbb{R}^r)$ and $F \in \mathcal{B}(\mathbb{R}^s)$, then $E \times F \in \mathcal{B}(\mathbb{R}^{r+s})$. Hint: For an open subset $E$ of $\mathbb{R}^r$, consider the collection $\Omega = \{F \subseteq \mathbb{R}^s : E \times F \in \mathcal{B}(\mathbb{R}^{r+s})\}$. Show that $\mathcal{B}(\mathbb{R}^s) \subseteq \Omega$. Then, for a Borel subset $F$ of $\mathbb{R}^s$, consider the collection $\{E \subseteq \mathbb{R}^r : E \times F \in \mathcal{B}(\mathbb{R}^{r+s})\}$.

18. Let $C$ be the Cantor set, and define a function $f : [0, 1] \to C$ as follows: $f(0) = 0$, and, for $x \in (0, 1]$, write $x = \sum_{i=1}^{\infty} a_i/2^i$⁴ and set $f(x) = \sum_{i=1}^{\infty} 2a_i/3^i$. Show that $f$ is Borel measurable. Hint: For a fixed $i \in \mathbb{N}$, define $f_i(x) = a_i$. It is enough to show that $f_i$ is measurable. To this end, show that $f_i = \sum \{\chi_{E_k} : k = 1, 3, 5, \dots, 2^i - 1\}$, where $E_k = \big(\frac{k}{2^i}, \frac{k+1}{2^i}\big]$.


8.3 Abstract Integration 


In this section, we examine Lebesgue's revolutionary approach to the definition of the integral. The motivation below is imprecise and does not rigorously develop any particular set of ideas. For the sake of simplicity, we assume that $f$ is a positive continuous function on a compact interval.

The Riemann integral is based on the simple geometrical idea of dividing the region below the graph of $f$ into thin vertical strips, where the area below the graph is approximated by the integral of a step function (the Riemann sum). Lebesgue's idea was to divide the range of $f$ by points $y_0, \dots, y_n$ and, for each index $k$, to consider the sets $E_k = f^{-1}([y_k, y_{k+1}))$. Even for an uncomplicated function, the set $E_k$ may come in several fragments, as shown in figure 8.1, where $E_k$ has three fragments. When $y_{k+1} - y_k$ is small, the approximate combined area of the three shaded strips is the approximate common height, $y_k$, times the sum of the lengths of the three fragments that comprise the set $E_k$ or, more precisely, the measure of $E_k$. Thus the approximate area below the graph is $\sum_k y_k \mu(E_k)$, which, by definition, is the integral of a simple function. Needless to say, as the partition of the range of $f$ gets finer, we expect the integrals of the simple functions to converge to the integral of $f$. This is the overarching idea in Lebesgue integration. As it turns out, we can integrate far more functions under the Lebesgue definition than under the Riemann definition. For example, the integral of any positive measurable function is defined, although it may not be finite. Additionally, the definition of the integral extends seamlessly to abstract measure spaces. The section results capture the above ideas. First we define the integral of a positive measurable function $f$, then we show that $f$ is the limit of simple functions, $s_n$, and then we show that $\int_X f\,d\mu = \lim_n \int_X s_n\,d\mu$. Extending the definition of the integral to complex functions follows without difficulty. The section concludes with three important convergence theorems.

Figure 8.1 Lebesgue integration (the horizontal band between the levels $y_k$ and $y_{k+1}$ meets the graph over three fragments of $E_k$)

⁴ We use the series representation of $x$ if $x$ has a terminating binary expansion.
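
The range-partition idea can be imitated numerically. The sketch below is a rough illustration, not a definition: it assumes the test function $f(x) = x^2$ on $[0, 1]$, estimates the length of each $E_k$ by counting points of a fine uniform grid, and all names (`lebesgue_style_sum`, `levels`, `samples`) are hypothetical.

```python
def lebesgue_style_sum(f, a, b, levels, samples=100_000):
    """Approximate sum_k y_k * m(E_k), where E_k = f^{-1}([y_k, y_{k+1})).

    The range of f is cut into `levels` equal bands. The length of E_k is
    estimated by counting how many grid points of [a, b] land in band k.
    Assumes f is not constant on [a, b].
    """
    xs = [a + (b - a) * (i + 0.5) / samples for i in range(samples)]
    values = [f(x) for x in xs]
    lo, hi = min(values), max(values)
    width = (hi - lo) / levels
    counts = [0] * levels
    for v in values:
        k = min(int((v - lo) / width), levels - 1)  # top value goes in last band
        counts[k] += 1
    cell = (b - a) / samples  # each grid point stands for this much length
    return sum((lo + k * width) * counts[k] * cell for k in range(levels))


approx = lebesgue_style_sum(lambda x: x * x, 0.0, 1.0, levels=1000)
print(approx)  # close to, and slightly below, 1/3
```

Because each band is weighted by its lower level $y_k$, the sum underestimates the area by at most the band width times the length of the interval, which mirrors the role of simple functions lying below $f$.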


Definition. Let $(X, \mathfrak{M})$ be a measurable space. A simple function on $X$ is a function $s : X \to \mathbb{C}$ of finite range. If $a_1, \dots, a_n$ are the distinct values of $s$, then $s = \sum_{i=1}^{n} a_i \chi_{E_i}$, where $E_i = s^{-1}(a_i)$. Clearly, $E_i \cap E_j = \emptyset$ if $i \ne j$.


Remarks. (a) It is clear that a simple function is measurable if and only if each $E_i$ is a measurable set. Also, a simple function need not have bounded support. For example, $s = \chi_{(-\infty, 0)} + \chi_{(1, 2)}$ is not supported on a bounded set.


(b) Our definition of a simple function is sometimes referred to as the standard form of a simple function. It is important to understand that any finite linear combination of characteristic functions of disjoint sets is a simple function. For example, if $s = \sum_{j=1}^{n} a_j \chi_{E_j}$, where the sets $E_j$ are disjoint and $a_1, \dots, a_n$ are not all distinct, we can rewrite $s$ in standard form as follows: Let $b_1, \dots, b_m$ be the distinct values of $s$ and, for $1 \le i \le m$, let $T_i = \{j \in \mathbb{N}_n : a_j = b_i\}$. Note that $\{T_1, \dots, T_m\}$ is a partition of $\mathbb{N}_n$. Set $B_i = \bigcup_{j \in T_i} E_j$. The sets $B_i$ are disjoint since the sets $E_j$ are. Clearly, $s = \sum_{i=1}^{m} b_i \chi_{B_i}$. It is, in fact, true that a finite linear combination of characteristic functions of subsets of $X$ is a simple function, even when the sets $E_j$ overlap. An inductive proof is possible. We invite the reader to try it.


Definition. The integral of a simple function. Let $(X, \mathfrak{M}, \mu)$ be a measure space, and let $s = \sum_{i=1}^{n} a_i \chi_{E_i}$ be a measurable simple function in standard form. The integral of $s$ with respect to $\mu$ is

$$\int_X s\,d\mu = \sum_{i=1}^{n} a_i \mu(E_i).$$

The above formula is robust in the sense that it is valid even when $s$ is not in standard form. This follows from remark (b) above. If $a_1, \dots, a_n$ are not all distinct, write $s$ in standard form using remark (b): $s = \sum_{i=1}^{m} b_i \chi_{B_i}$. Then

$$\sum_{i=1}^{m} b_i \mu(B_i) = \sum_{i=1}^{m} b_i \sum_{j \in T_i} \mu(E_j) = \sum_{j=1}^{n} a_j \mu(E_j).$$
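
The robustness claim can be checked on a small example. The sketch below assumes a hypothetical finite measure space in which every point carries a weight, and a simple function given by (coefficient, set) pairs with disjoint sets and repeated coefficients; the helper names are illustrative only.

```python
from collections import defaultdict


def integral_from_terms(terms, mu):
    """Compute sum_i a_i * mu(E_i) for a list of (a_i, E_i) pairs."""
    return sum(a * mu(E) for a, E in terms)


def standard_form(terms):
    """Merge disjoint sets that share a coefficient: B_i = union of E_j with a_j = b_i."""
    merged = defaultdict(set)
    for a, E in terms:
        merged[a] |= set(E)
    return [(b, frozenset(B)) for b, B in sorted(merged.items())]


# Hypothetical weighted measure on X = {1, ..., 5}.
weights = {1: 0.5, 2: 1.0, 3: 2.0, 4: 0.25, 5: 1.5}


def mu(E):
    return sum(weights[x] for x in E)


# Disjoint sets, but the coefficients 2 and 3 each appear twice.
terms = [(2, {1}), (3, {2, 3}), (2, {4}), (3, {5})]

print(integral_from_terms(terms, mu))                 # 15.0
print(standard_form(terms))
print(integral_from_terms(standard_form(terms), mu))  # 15.0
```

Both representations give the same value, which is the content of the displayed identity: regrouping the sets $E_j$ by coefficient does not change the sum because $\mu$ is additive on disjoint sets.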


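Theorem 8.3.1. If $s$ and $t$ are simple functions and $c \in \mathbb{K}$, then

$$\int_X (s + t)\,d\mu = \int_X s\,d\mu + \int_X t\,d\mu \quad\text{and}\quad \int_X cs\,d\mu = c \int_X s\,d\mu.$$


Proof. The second identity follows trivially from the definition.
Let $s = \sum_{i=1}^{m} a_i \chi_{E_i}$ and $t = \sum_{j=1}^{n} b_j \chi_{F_j}$ be simple functions in standard form, and let $B_{ij} = E_i \cap F_j$, $1 \le i \le m$, $1 \le j \le n$. The collection $\{B_{ij}\}$ is disjoint, $E_i = \bigcup_{j=1}^{n} B_{ij}$, and $F_j = \bigcup_{i=1}^{m} B_{ij}$. Now $s = \sum_{i=1}^{m} \sum_{j=1}^{n} a_i \chi_{B_{ij}}$, $t = \sum_{i=1}^{m} \sum_{j=1}^{n} b_j \chi_{B_{ij}}$, and $s + t = \sum_{i=1}^{m} \sum_{j=1}^{n} (a_i + b_j) \chi_{B_{ij}}$. By definition,

$$\int_X (s + t)\,d\mu = \sum_{i=1}^{m} \sum_{j=1}^{n} (a_i + b_j) \mu(B_{ij}) = \sum_{i=1}^{m} a_i \sum_{j=1}^{n} \mu(B_{ij}) + \sum_{j=1}^{n} b_j \sum_{i=1}^{m} \mu(B_{ij}) = \sum_{i=1}^{m} a_i \mu(E_i) + \sum_{j=1}^{n} b_j \mu(F_j) = \int_X s\,d\mu + \int_X t\,d\mu.$$ ∎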


Remark. This proof includes a proof of the fact that the sum of two simple 
functions is a simple function. 


Definition. The integral of a positive function. Let $f : X \to [0, \infty]$ be a measurable function. We define

$$\int_X f\,d\mu = \sup\Big\{\int_X s\,d\mu : 0 \le s \le f,\ s \text{ is simple}\Big\}.$$


Observe that this definition is reminiscent of the fact that the Riemann integral 
of a function is the supremum of the lower Riemann sums of the function 
and that a lower Riemann sum of a function is the integral of a step function 
dominated by f. 


The following facts are immediately obvious: 


(a) $\int_X cf\,d\mu = c \int_X f\,d\mu$ for $c \ge 0$, and
(b) If $g : X \to [0, \infty]$ is measurable and $0 \le f \le g$, then $\int_X f\,d\mu \le \int_X g\,d\mu$.

The fact that, for positive functions $f$ and $g$, $\int_X (f + g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu$ requires the development of some machinery. First we show that a positive measurable function $f$ is the limit of a sequence, $s_n$, of simple functions, then we show that $\lim_n \int_X s_n\,d\mu = \int_X f\,d\mu$. The details appear below.


Definition. The positive and negative parts of a measurable real-valued function $f$ are, respectively,

$$f^+(x) = \max\{f(x), 0\}, \quad\text{and}\quad f^-(x) = -\min\{f(x), 0\}.$$

Observe that $f^+$ and $f^-$ are positive, measurable functions, $f = f^+ - f^-$, and $|f| = f^+ + f^-$.


Theorem 8.3.2. (a) Let $f : X \to [0, \infty]$ be a measurable function. Then there exists an increasing sequence of simple functions $s_1, s_2, \dots$ such that $\lim_n s_n(x) = f(x)$ for every $x \in X$.
(b) Let $f : X \to \mathbb{C}$ be a measurable function. Then there exists a sequence of simple functions $u_1, u_2, \dots$ such that $\lim_n u_n(x) = f(x)$ for every $x \in X$ and $|u_1| \le |u_2| \le \dots \le |f|$.


Proof. For each $n \in \mathbb{N}$, define $E_{n,k} = \{x \in X : \frac{k-1}{2^n} \le f(x) < \frac{k}{2^n}\}$, $k = 1, 2, \dots, n2^n$, and $F_n = \{x \in X : f(x) \ge n\}$. Let

$$s_n = \sum_{k=1}^{n2^n} \frac{k-1}{2^n} \chi_{E_{n,k}} + n \chi_{F_n}.$$

The fact that $s_n(x) \le f(x)$ is clear. Now every $x \in X$ belongs to exactly one of the sets $E_{n,k}$ or $F_n$. We show that $s_n$ is an increasing sequence. If $\frac{2k-2}{2^{n+1}} \le f(x) < \frac{2k-1}{2^{n+1}}$, then $s_n(x) = \frac{k-1}{2^n} = \frac{2k-2}{2^{n+1}} = s_{n+1}(x)$. If $\frac{2k-1}{2^{n+1}} \le f(x) < \frac{2k}{2^{n+1}}$, then $s_n(x) = \frac{k-1}{2^n} < \frac{2k-1}{2^{n+1}} = s_{n+1}(x)$. If $f(x) \ge n$, $n = s_n(x) \le s_{n+1}(x)$. Finally we show that $\lim_n s_n(x) = f(x)$. If $f(x) < \infty$, then $0 \le f(x) - s_n(x) < 1/2^n$ for every $n > f(x)$. If $f(x) = \infty$, $s_n(x) = n$ for all $n \in \mathbb{N}$. In either case, $\lim_n s_n(x) = f(x)$.


To prove (b), write $f = (f_1^+ - f_1^-) + i(f_2^+ - f_2^-)$. By part (a), there are sequences of positive, increasing, measurable simple functions $s_n^+$, $s_n^-$, $t_n^+$, and $t_n^-$ such that $\lim_n s_n^+ = f_1^+$, $\lim_n s_n^- = f_1^-$, $\lim_n t_n^+ = f_2^+$, and $\lim_n t_n^- = f_2^-$. The sequence of simple functions $u_n = (s_n^+ - s_n^-) + i(t_n^+ - t_n^-)$ satisfies the requirements of part (b). ∎


Remark. This proof shows that if $f$ is a bounded, positive, measurable function, then $s_n$ converges uniformly to $f$ because $\|s_n - f\|_\infty \le 1/2^n$ for every $n$ larger than a bound of $f$.
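
The approximating functions from part (a) are easy to compute pointwise. The sketch below evaluates $s_n(x)$ directly from the value $f(x)$; the function name `s` and the sample values are assumptions made for the illustration.

```python
import math


def s(n, fx):
    """Value of s_n at a point x, given fx = f(x) in [0, inf].

    If f(x) >= n the value is n (the set F_n); otherwise f(x) lies in
    [(k-1)/2^n, k/2^n) for exactly one k, and the value is (k-1)/2^n.
    """
    if fx >= n:
        return float(n)
    return math.floor(fx * 2**n) / 2**n


# Hypothetical sample values of f(x), including an infinite one.
samples = [0.0, 0.3, 1.0, 2.75, math.pi, 7.9, math.inf]

for fx in samples:
    print(fx, [s(n, fx) for n in range(1, 6)])
```

For each finite value the printed list increases toward $f(x)$ and, once $n > f(x)$, stays within $2^{-n}$ of it; for the infinite value it equals $n$, as in the proof.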

Lemma 8.3.3. Let $f_n : X \to [0, \infty]$ be an increasing sequence of measurable functions such that $\lim_n f_n(x) = f(x)$ for all $x \in X$. If $s$ is a simple measurable function such that $0 \le s \le f$, then $\lim_n \int_X f_n\,d\mu \ge \int_X s\,d\mu$.


Proof. Let $s = \sum_{i=1}^{m} a_i \chi_{E_i}$. Fix $0 < \alpha < 1$, and define $B_n = \{x \in X : f_n(x) \ge \alpha s(x)\}$. It is easy to see that $B_n \subseteq B_{n+1}$ and that $\bigcup_{n=1}^{\infty} B_n = X$. Notice that $(E_i \cap B_n)_n$ is an ascending sequence of sets, and $E_i = \bigcup_{n=1}^{\infty} (E_i \cap B_n)$; hence $\mu(E_i) = \lim_n \mu(E_i \cap B_n)$. Now $\int_X f_n\,d\mu \ge \int_X f_n \chi_{B_n}\,d\mu \ge \alpha \int_X s \chi_{B_n}\,d\mu = \alpha \sum_{i=1}^{m} a_i \mu(E_i \cap B_n)$. Taking the limit as $n \to \infty$, we obtain $\lim_n \int_X f_n\,d\mu \ge \alpha \sum_{i=1}^{m} a_i \mu(E_i) = \alpha \int_X s\,d\mu$. The result we need follows by letting $\alpha \to 1$. ∎


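Theorem 8.3.4. Let $f \ge 0$ be a measurable function, and let $0 \le s_n \le s_{n+1}$ be simple functions such that $s_n \le f$, and $\lim_n s_n(x) = f(x)$. Then $\int_X f\,d\mu = \lim_n \int_X s_n\,d\mu$.


Proof. Since $0 \le s_n \le f$, $\int_X s_n\,d\mu \le \int_X f\,d\mu$, and $\lim_n \int_X s_n\,d\mu \le \int_X f\,d\mu$. Now if $t$ is a simple function such that $0 \le t \le f$, then, by lemma 8.3.3, $\int_X t\,d\mu \le \lim_n \int_X s_n\,d\mu$. Therefore, $\int_X f\,d\mu = \sup\{\int_X t\,d\mu : 0 \le t \le f,\ t \text{ simple}\} \le \lim_n \int_X s_n\,d\mu$. ∎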


Theorem 8.3.5. If $f \ge 0$ and $g \ge 0$ are measurable functions and $a, b \ge 0$, then $\int_X (af + bg)\,d\mu = a \int_X f\,d\mu + b \int_X g\,d\mu$.


Proof. By theorem 8.3.2, there exist sequences of simple functions $s_1 \le s_2 \le \dots$, and $t_1 \le t_2 \le \dots$ such that $\lim_n s_n(x) = f(x)$, and $\lim_n t_n(x) = g(x)$. By theorem 8.3.1, $\int_X (as_n + bt_n)\,d\mu = a \int_X s_n\,d\mu + b \int_X t_n\,d\mu$. Now the sequence of simple functions $as_n + bt_n$ is increasing, and $\lim_n (as_n(x) + bt_n(x)) = af(x) + bg(x)$. Thus, by theorem 8.3.4, $\int_X (af + bg)\,d\mu = \lim_n \int_X (as_n + bt_n)\,d\mu = a \lim_n \int_X s_n\,d\mu + b \lim_n \int_X t_n\,d\mu = a \int_X f\,d\mu + b \int_X g\,d\mu$. ∎


Definition. The integral of a real function. Let $f : X \to [-\infty, \infty]$ be a measurable function, and write $f = f^+ - f^-$. By definition,

$$\int_X f\,d\mu = \int_X f^+\,d\mu - \int_X f^-\,d\mu,$$

provided that at least one of the integrals on the right-hand side of the definition is finite. We say $f$ is integrable if both $\int_X f^+\,d\mu$ and $\int_X f^-\,d\mu$ are finite, which is equivalent to the condition that $\int_X |f|\,d\mu < \infty$. This is because $|f| = f^+ + f^-$, $f^+ \le |f|$, and $f^- \le |f|$.


Theorem 8.3.6. If $f$ and $g$ are real and integrable, then $\int_X (f + g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu$. Also, $\int_X af\,d\mu = a \int_X f\,d\mu$ for every real number $a$.

Proof. Write $f = f^+ - f^-$, $g = g^+ - g^-$, and let $h = f + g$. Writing $h = h^+ - h^-$ yields $h^+ + f^- + g^- = h^- + f^+ + g^+$. Integrating both sides of the last identity and using the previous theorem, we obtain $\int_X h^+\,d\mu + \int_X f^-\,d\mu + \int_X g^-\,d\mu = \int_X h^-\,d\mu + \int_X f^+\,d\mu + \int_X g^+\,d\mu$. The result we seek is obtained by rearranging the last identity. To complete the proof of the theorem, we only need to show that $\int_X (-f)\,d\mu = -\int_X f\,d\mu$. This is a simple calculation if we use the identities $(-f)^+ = f^-$, and $(-f)^- = f^+$. ∎


Definition. The integral of a complex function. If $f : X \to \mathbb{C}$, and $f = f_1 + i f_2$, define $\int_X f\,d\mu = \int_X f_1\,d\mu + i \int_X f_2\,d\mu$. We say that $f$ is integrable if $f_1$ and $f_2$ are integrable.


Notice that a complex function is integrable if and only if $\int_X |f|\,d\mu < \infty$. This is because $|f_1| \le |f|$, $|f_2| \le |f|$, and $|f| \le |f_1| + |f_2|$.


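Theorem 8.3.7. If $f$ and $g$ are integrable complex functions, and $a, b \in \mathbb{C}$, then $af + bg$ is integrable and

$$\int_X (af + bg)\,d\mu = a \int_X f\,d\mu + b \int_X g\,d\mu.$$


Proof. $\int_X |af + bg|\,d\mu \le \int_X (|a||f| + |b||g|)\,d\mu = |a| \int_X |f|\,d\mu + |b| \int_X |g|\,d\mu < \infty$. Thus $af + bg$ is integrable. The verification that $\int_X (f + g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu$ is a routine calculation, as is the fact that $\int_X cf\,d\mu = c \int_X f\,d\mu$ when $c$ is a real constant. It now suffices to show that $\int_X if\,d\mu = i \int_X f\,d\mu$. Indeed, $\int_X if\,d\mu = \int_X i(f_1 + i f_2)\,d\mu = \int_X (-f_2 + i f_1)\,d\mu = -\int_X f_2\,d\mu + i \int_X f_1\,d\mu = i \int_X f\,d\mu$. ∎


It is easy to see that the set of complex integrable functions is a vector space. We denote it by $\mathfrak{L}^1(\mu)$. In fact, if a norm is defined on $\mathfrak{L}^1(\mu)$ by $\|f\|_1 = \int_X |f|\,d\mu$, then $\mathfrak{L}^1(\mu)$ is a normed linear space (provided functions that agree almost everywhere are identified), as the reader can easily verify.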


Definition. Let $(X, \mathfrak{M}, \mu)$ be a measure space, and let $P(x)$ be a property that may
or may not be satisfied by a point x € X. For example, for a given extended real- 
valued function f, P(x) may be the property that f(x) is finite. Another example 
is the property that f(x) = g(x) for two measurable functions f and g. We say 
that property P holds for almost every x in a measurable set E, or that P holds 
almost everywhere in E, if u({x € E : P(x) is false}) = 0. In this situation, we 
often write “P holds for a.e. x € E.” The examples below are good illustrations of 
the concept. 


Example 1. If $f \in \mathfrak{L}^1(\mu)$, then $f$ is finite almost everywhere.
Let $E_n = \{x \in X : |f(x)| \ge n\}$. It is clear that $E_1 \supseteq E_2 \supseteq E_3 \supseteq \dots$, and that, for each $n \in \mathbb{N}$, $\int_X |f|\,d\mu \ge \int_X |f| \chi_{E_n}\,d\mu \ge \int_X n \chi_{E_n}\,d\mu = n\mu(E_n)$. Thus $\mu(E_n) \le \frac{1}{n} \int_X |f|\,d\mu \to 0$ as $n \to \infty$. Therefore $\mu(\bigcap_{n=1}^{\infty} E_n) = \lim_n \mu(E_n) = 0$. But $\bigcap_{n=1}^{\infty} E_n = \{x \in X : |f(x)| = \infty\}$. This shows that $f$ is finite almost everywhere. ∎


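Example 2. If $f \in \mathfrak{L}^1(\mu)$, then $|\int_X f\,d\mu| \le \int_X |f|\,d\mu$.
Assume that $z = \int_X f\,d\mu \ne 0$ because, otherwise, there is nothing to prove. Let $\alpha = |z|/z$. Then $|\alpha| = 1$, and $\alpha z = |z|$. Now $|\int_X f\,d\mu| = |z| = \alpha z = \alpha \int_X f\,d\mu = \int_X \alpha f\,d\mu$. It follows that $\int_X \alpha f\,d\mu$ is real and positive; therefore $\int_X \alpha f\,d\mu = \int_X u\,d\mu$, where $u = \mathrm{Re}(\alpha f)$. Now $u \le |\alpha f| = |f|$, and $\int_X u\,d\mu \le \int_X |\alpha f|\,d\mu = \int_X |f|\,d\mu$. ∎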


Definition. If $f : X \to [0, \infty]$ is measurable and $E \in \mathfrak{M}$, we define

$$\int_E f\,d\mu = \int_X f \chi_E\,d\mu.$$

Note that if $E_1 \subseteq E_2$, then $\int_{E_1} f\,d\mu \le \int_{E_2} f\,d\mu$. Also, if $0 \le f \le g$, then $\int_E f\,d\mu \le \int_E g\,d\mu$.


When $s = \sum_{i=1}^{n} a_i \chi_{E_i}$ is a simple function, then $s \chi_E = \sum_{i=1}^{n} a_i \chi_{E_i \cap E}$ is also a simple function and

$$\int_E s\,d\mu = \sum_{i=1}^{n} a_i \mu(E_i \cap E).$$

This equation can very well be used to define $\int_E s\,d\mu$. One can then take the alternative approach of defining

$$\int_E f\,d\mu = \sup\Big\{\int_E s\,d\mu : 0 \le s \le f,\ s \text{ is simple}\Big\}.$$

The two methods of defining $\int_E f\,d\mu$ are clearly equivalent, and the interested reader is encouraged to work out the details of reconciling the two definitions.


Another detail must be mentioned here. If $E \in \mathfrak{M}$, one can restrict $\mathfrak{M}$ and $\mu$ to $E$ in the obvious way: Define $\mathfrak{M}_E$ to be the members of $\mathfrak{M}$ contained in $E$, and define $\mu_E$ to be the restriction of $\mu$ to $\mathfrak{M}_E$. This clearly turns $(E, \mathfrak{M}_E, \mu_E)$ into a measure space, and it makes sense to define $\int_E f\,d\mu$ to be the integral of $f|_E$ with respect to $(E, \mathfrak{M}_E, \mu_E)$. Again, this definition is consistent with the above definitions of $\int_E f\,d\mu$, and, again, we leave the details to the interested reader.


Example 3. Suppose $f : X \to [0, \infty]$ is a measurable function. If $\int_E f\,d\mu = 0$ for some measurable set $E$, then $f = 0$ a.e. on $E$. In particular, if $\int_X f\,d\mu = 0$, then $f = 0$ a.e. on $X$.

Let $E_n = \{x \in E : f(x) > \frac{1}{n}\}$. Then $\frac{1}{n}\mu(E_n) \le \int_{E_n} f\,d\mu \le \int_E f\,d\mu = 0$. Thus $\mu(E_n) = 0$. The result now follows from the fact that $\{x \in E : f(x) > 0\} = \bigcup_{n=1}^{\infty} E_n$ and $\mu(\bigcup_{n=1}^{\infty} E_n) \le \sum_{n=1}^{\infty} \mu(E_n) = 0$. ∎


Example 4. If $f$ is a measurable function and $\int_E f\,d\mu = 0$ for every measurable set $E$, then $f = 0$ a.e.
Without loss of generality, assume $f$ is real. Let $E = \{x \in X : f(x) \ge 0\}$. By assumption, $\int_E f\,d\mu = 0$. But $\int_E f\,d\mu = \int_X f^+\,d\mu$. By example 3, $f^+ = 0$ a.e. on $X$. Similarly, $f^- = 0$ a.e. on $X$. ∎


Convergence Theorems 


Theorem 8.3.8 (Fatou's theorem). Let $f_n : X \to [0, \infty]$ be a sequence of measurable functions. Then

$$\int_X \liminf_n f_n\,d\mu \le \liminf_n \int_X f_n\,d\mu.$$


Proof. Let $g_n = \inf_{i \ge n} f_i$. Then $0 \le g_1 \le g_2 \le \dots$, and let $f(x) = \lim_n g_n(x)$. Note that $f(x) = \liminf_n f_n(x)$. If $s$ is a simple function such that $0 \le s \le f$, then, by lemma 8.3.3, $\int_X s\,d\mu \le \lim_n \int_X g_n\,d\mu$. Hence $\int_X f\,d\mu = \sup\{\int_X s\,d\mu : 0 \le s \le f\} \le \lim_n \int_X g_n\,d\mu$. Since $g_n \le f_n$, $\int_X g_n\,d\mu \le \int_X f_n\,d\mu$, and $\lim_n \int_X g_n\,d\mu \le \liminf_n \int_X f_n\,d\mu$. ∎
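
The inequality in Fatou's theorem can be strict. A standard illustration, sketched here under the assumption that $X = \mathbb{R}$ carries Lebesgue measure (written $m$ below, a symbol chosen only for this example), uses tall thin spikes whose mass does not shrink.

```latex
\[
f_n = n\,\chi_{(0,\,1/n)}, \qquad
\int_{\mathbb{R}} f_n \, dm = n \cdot m\bigl((0, 1/n)\bigr) = 1
\quad\text{for every } n,
\]
\[
\liminf_n f_n(x) = 0 \ \text{ for every } x \in \mathbb{R},
\qquad\text{so}\qquad
\int_{\mathbb{R}} \liminf_n f_n \, dm = 0 \;<\; 1 = \liminf_n \int_{\mathbb{R}} f_n \, dm .
\]
```

Each fixed $x$ lies outside $(0, 1/n)$ once $n$ is large, so the pointwise limit vanishes even though every $f_n$ has integral 1.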


Example 5. Let $(f_n)$ be a convergent sequence in $\mathfrak{L}^1(\mu)$, and let $f$ be its $\mathfrak{L}^1$-limit. Then $(f_n)$ contains a subsequence that converges to $f$ for almost every $x \in X$.


Choose a subsequence $(f_{n_i})$ of $(f_n)$ such that, for $i \in \mathbb{N}$, $\|f_{n_i} - f\|_1 < 2^{-i}$. Let $g_k = \sum_{i=1}^{k} |f_{n_i} - f|$. The functions $g_k$ are in $\mathfrak{L}^1$ and, by construction, $0 \le g_1 \le g_2 \le \dots$, and $\|g_k\|_1 < 1$. Let $g(x) = \lim_k g_k(x)$. By Fatou's theorem, $\int_X g\,d\mu \le \liminf_k \|g_k\|_1 \le 1$. This shows that $g \in \mathfrak{L}^1$.⁵ Since $g(x) = \sum_{i=1}^{\infty} |f_{n_i}(x) - f(x)|$, it follows that the series $\sum_{i=1}^{\infty} |f_{n_i}(x) - f(x)|$ is convergent for a.e. $x \in X$ (by example 1). In particular, $\lim_{i \to \infty} |f_{n_i}(x) - f(x)| = 0$ for a.e. $x \in X$. ∎


Theorem 8.3.9 (the monotone convergence theorem). If $f_n : X \to [0, \infty]$ is an increasing sequence of measurable functions such that $f(x) = \lim_n f_n(x)$ exists for every $x \in X$, then

$$\int_X f\,d\mu = \lim_n \int_X f_n\,d\mu.$$


⁵ One can also use the monotone convergence theorem to show that $g \in \mathfrak{L}^1$.

Proof. Since $f_n$ is increasing, $\liminf_n f_n = f$, and since $\int_X f_n\,d\mu$ is increasing, $\liminf_n \int_X f_n\,d\mu = \lim_n \int_X f_n\,d\mu$. By Fatou's theorem, $\int_X f\,d\mu \le \lim_n \int_X f_n\,d\mu$. Since $f_n \le f$, $\int_X f_n\,d\mu \le \int_X f\,d\mu$, and $\lim_n \int_X f_n\,d\mu \le \int_X f\,d\mu$, and the proof is complete. ∎


Example 6. Let $f \in \mathfrak{L}^1$, and suppose that $X = \bigcup_{n=1}^{\infty} E_n$, where $E_n$ is an ascending sequence of measurable sets. Then $\lim_n \int_{E_n} |f|\,d\mu = \int_X |f|\,d\mu$.

For $n \in \mathbb{N}$ define $f_n = |f| \chi_{E_n}$. It is clear that $f_n$ is increasing and that $\lim_n f_n(x) = |f|(x)$. By the monotone convergence theorem, $\lim_n \int_{E_n} |f|\,d\mu = \lim_n \int_X f_n\,d\mu = \int_X |f|\,d\mu$. ∎


Theorem 8.3.10 (the dominated convergence theorem). Let $f_n$ be a sequence of
complex measurable functions, and let $g \in \mathfrak{L}^1(\mu)$ be such that $|f_n(x)| \le |g(x)|$. If
$f(x) = \lim_n f_n(x)$ exists for every $x \in X$, then

$$f \in \mathfrak{L}^1(\mu) \quad\text{and}\quad \lim_n \int_X |f_n - f|\,d\mu = 0,$$

that is, $f_n \to f$ in $\mathfrak{L}^1(\mu)$.


Proof. Notice that $|f_n(x)| \le |g(x)|$ implies that $|f(x)| \le |g(x)|$. Hence $f_n \in \mathfrak{L}^1(\mu)$,
and $f \in \mathfrak{L}^1(\mu)$. Since $|f_n - f| \le 2|g|$, we can apply Fatou's theorem to the
nonnegative sequence $2|g| - |f_n - f|$ to obtain
$\int_X 2|g|\,d\mu \le \liminf_n \int_X (2|g| - |f_n - f|)\,d\mu = \int_X 2|g|\,d\mu - \limsup_n \int_X |f_n - f|\,d\mu$.
Hence $\limsup_n \int_X |f_n - f|\,d\mu \le 0$, so $\limsup_n \int_X |f_n - f|\,d\mu = 0$.
Since $\int_X |f_n - f|\,d\mu$ is a nonnegative sequence, $\lim_n \int_X |f_n - f|\,d\mu = 0$,
as desired.


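Example 7. Let $f \in \mathfrak{L}^1(\mu)$. Then, for every $\epsilon > 0$, there exists $\delta > 0$ such that
whenever $\mu(E) < \delta$, $\int_E |f|\,d\mu < \epsilon$.

Suppose, for a contradiction, that there exists a number $\epsilon > 0$ such that, for every
$n \in \mathbb{N}$, there is a measurable set $E_n$ such that $\mu(E_n) < 2^{-n}$ and $\int_{E_n} |f|\,d\mu \ge \epsilon$.
Let $F_k = \cup_{n \ge k} E_n$, and let $F = \cap_{k=1}^{\infty} F_k$. On the one hand,
$\mu(F_k) < \sum_{n=k}^{\infty} 2^{-n} = 2^{-k+1}$; hence $\mu(F) = \lim_k \mu(F_k) = 0$, and $\int_F |f|\,d\mu = 0$.
On the other hand, by the dominated convergence theorem,
$\int_F |f|\,d\mu = \lim_k \int_{F_k} |f|\,d\mu \ge \liminf_k \int_{E_k} |f|\,d\mu \ge \epsilon$.
This contradiction establishes the result. ◆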


Exercises 
In the problems below, $(X, \mathfrak{M}, \mu)$ is a measure space.
1. Let $f$ be a measurable function, and let $g$ be a function such that $f(x) = g(x)$
for a.e. $x \in X$. Prove that $g$ is measurable.

2. Define a relation on the collection of measurable functions as follows: $f \equiv g$
if $f(x) = g(x)$ for a.e. $x \in X$. Prove that $\equiv$ is an equivalence relation.

3. Let $f$ be an integrable function, and let $g$ be such that $f(x) = g(x)$ a.e. Prove
that $g$ is integrable and that $\int_X f\,d\mu = \int_X g\,d\mu$.

4. Let $f \in \mathfrak{L}^1(\mu)$, and let $E = \{x \in X : |f(x)| > c\}$, where $c > 0$. Prove
the inequality (Tchebychev) $\mu(E) \le \frac{1}{c}\int_E |f|\,d\mu$. More generally, if $f$ is
measurable and $|f|^p \in \mathfrak{L}^1(\mu)$, then $\mu(E) \le \frac{1}{c^p}\int_E |f|^p\,d\mu$. Here $1 \le p < \infty$.

5. Let $f \in \mathfrak{L}^1(\mu)$. Show that the set $E = \{x \in X : f(x) \ne 0\}$ is a countable union
of sets of finite measure.

6. Let $f$ be a positive measurable function. Show that if $E$ and $F$ are measurable
sets such that $\mu(E \Delta F) = 0$, then $\int_E f\,d\mu = \int_F f\,d\mu$.

7. Let $f_n$ be a sequence of measurable functions such that $\sum_{n=1}^{\infty} \int_X |f_n|\,d\mu < \infty$.
Show that the series $\sum_{n=1}^{\infty} |f_n(x)|$ converges a.e. in $X$.

8. Show that if $\mu$ is a finite measure and $(f_n)$ is a sequence of bounded
measurable functions such that $f_n$ converges uniformly to $f$, then
$\lim_n \int_X |f_n - f|\,d\mu = 0$.

9. Let $f \in \mathfrak{L}^1(\mu)$. Prove that for every $\epsilon > 0$, there exists a set $E$ of finite measure
such that $\int_E |f|\,d\mu > \|f\|_1 - \epsilon$.

10. Let $(f_n)$ be a decreasing sequence of nonnegative measurable functions, and
let $f = \lim_n f_n$. Show that if $f_1$ is integrable, then $\int_X f\,d\mu = \lim_n \int_X f_n\,d\mu$.


8.4 Lebesgue Measure on R” 


This section is the centerpiece of the chapter. The motivation for the definition of
the Lebesgue measure, as well as an extensive development of its properties, appears
later in the section. We must first furnish some needed background. The four leading
results in this section are valid for locally compact Hausdorff spaces, and this is 
made abundantly clear in the excursion on Radon measures. We chose to limit the 
bulk of the section to the Lebesgue measure because we do not wish to base this 
section too heavily on chapter 5. 


Preliminaries 


Lemma 8.4.1 (Urysohn's lemma). Let $E$ and $F$ be disjoint closed subsets of $\mathbb{R}^n$.
Then there exists a continuous function $f : \mathbb{R}^n \to [0,1]$ such that $f(E) = 1$ and
$f(F) = 0$.


Proof. The functions $g(x) = \mathrm{dist}(x, F)$ and $h(x) = \mathrm{dist}(x, E)$ are continuous and are
never simultaneously zero since $E$ and $F$ are closed and disjoint. Furthermore,
$g(x) > 0$ for every $x \in E$, and $h(x) > 0$ for every $x \in F$. The function

$$f(x) = \frac{g(x)}{g(x) + h(x)}$$

has the stated properties. ∎


Lemma 8.4.2. Let $K$ be a compact subset of an open subset $V$ of $\mathbb{R}^n$. Then there exists
an open set $U$ such that $\overline{U}$ is compact and $K \subseteq U \subseteq \overline{U} \subseteq V$.


Proof. For every $x \in K$, there exists a ball $B(x, \delta_x)$ such that $\overline{B}(x, \delta_x) \subseteq V$. Since
$K$ is compact, and $K \subseteq \cup_{x \in K} B(x, \delta_x)$, there exists a finite number of points
$x_1, \dots, x_m \in K$ such that $K \subseteq \cup_{i=1}^{m} B(x_i, \delta_{x_i})$. The set $U = \cup_{i=1}^{m} B(x_i, \delta_{x_i})$ satisfies the
requirements. ∎


Definition. Let $f : \mathbb{R}^n \to \mathbb{C}$ be a continuous function. The support of $f$, written
$\mathrm{supp}(f)$, is the closure of the set $\{x \in \mathbb{R}^n : f(x) \ne 0\}$. A continuous function
$f : \mathbb{R}^n \to \mathbb{C}$ is said to be of compact support if $\mathrm{supp}(f)$ is compact.


We use the notation $\mathcal{C}_c(\mathbb{R}^n)$ to denote the set of continuous, complex-valued
functions of compact support on $\mathbb{R}^n$. Clearly, $\mathcal{C}_c(\mathbb{R}^n)$ is a vector space. We also use
$\mathcal{C}_c^r(\mathbb{R}^n)$ to denote the set of continuous, real-valued functions of compact support
on $\mathbb{R}^n$.


Notation. Let $K$ be a compact subset of $\mathbb{R}^n$, and let $V$ be an open subset of $\mathbb{R}^n$. For a
function $f \in \mathcal{C}_c^r(\mathbb{R}^n)$, we write $f \prec V$ to mean that $0 \le f \le 1$ and $\mathrm{supp}(f) \subseteq V$. We
use the notation $K \prec f$ to mean that $0 \le f \le 1$ and $f(x) = 1$ for all $x \in K$. Many
books refer to the following result as Urysohn's lemma.


Lemma 8.4.3 (Urysohn's lemma). Let $K$ be a compact subset of an open subset $V$
of $\mathbb{R}^n$. Then there exists a function $f \in \mathcal{C}_c^r(\mathbb{R}^n)$ such that $K \prec f \prec V$.


Proof. By lemma 8.4.2, there exists an open set $U$ such that $\overline{U}$ is compact and
$K \subseteq U \subseteq \overline{U} \subseteq V$. Applying lemma 8.4.1 with $E = K$ and $F = \mathbb{R}^n - U$, we find the
function we seek. ∎
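The distance-quotient construction of lemma 8.4.1 is easy to experiment with on the real line. The following is a minimal sketch under stated assumptions: the sets $E = [0,1]$ and $F = \mathbb{R} - (-0.5, 1.5)$, and all function names, are illustrative choices that do not come from the text.

```python
def dist_E(x):
    """Distance from x to the closed set E = [0, 1]."""
    if x < 0.0:
        return -x
    if x > 1.0:
        return x - 1.0
    return 0.0

def dist_F(x):
    """Distance from x to the closed set F = R - (-0.5, 1.5)."""
    if x <= -0.5 or x >= 1.5:
        return 0.0
    return min(x + 0.5, 1.5 - x)

def urysohn(x):
    """f = g / (g + h) with g = dist(., F) and h = dist(., E)."""
    g, h = dist_F(x), dist_E(x)
    # g and h never vanish together because E and F are disjoint and closed.
    return g / (g + h)

samples = [urysohn(-1.0 + 0.01 * i) for i in range(301)]
```

With $K = E$ and $U = (-0.5, 1.5)$, the function equals 1 on $K$, vanishes off $U$, and therefore satisfies $K \prec f \prec V$ for any open $V$ containing $[-0.5, 1.5]$, as in the proof of lemma 8.4.3.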


Lemma 8.4.4. Suppose $K \subseteq \mathbb{R}^n$ is compact, and let $\{V_1, \dots, V_m\}$ be an open cover of
$K$. Then there exist continuous, compactly supported functions $h_1, \dots, h_m$ such that
$h_i \prec V_i$ and $(h_1 + \dots + h_m)(x) = 1$ for all $x \in K$.


Proof. First we show that there exists an open cover $\{U_1, \dots, U_m\}$ of $K$ such that
each $\overline{U_i}$ is compact and $\overline{U_i} \subseteq V_i$. The proof is by induction on $m$. When $m = 2$,
let $K_1 = K - V_2$. Then $K_1$ is compact and contained in $V_1$. By lemma 8.4.2,
there exists an open set $U_1$ with compact closure such that $K_1 \subseteq U_1 \subseteq \overline{U_1} \subseteq V_1$.
Clearly, $\{U_1, V_2\}$ is an open cover of $K$. Now let $K_2 = K - U_1$, and repeat
the above argument to find an open set $U_2$ with compact closure such that
$K_2 \subseteq U_2 \subseteq \overline{U_2} \subseteq V_2$. Clearly, $\{U_1, U_2\}$ is an open cover of $K$. This proves the base
case when $m = 2$. We outline the inductive step. Let $\{V_1, \dots, V_m\}$ be an open cover
of $K$, and write $W = V_2 \cup \dots \cup V_m$. Then $\{V_1, W\}$ is an open cover of $K$. By what
we already established, there are open sets $U_1$ and $W_1$ with compact closures such
that $K \subseteq U_1 \cup W_1$, $\overline{U_1} \subseteq V_1$, and $\overline{W_1} \subseteq W = V_2 \cup \dots \cup V_m$. Now apply the
inductive hypothesis to the compact set $\overline{W_1}$ and its open cover $\{V_2, \dots, V_m\}$.


By lemma 8.4.3, there exist functions $g_i \in \mathcal{C}_c^r(\mathbb{R}^n)$ such that $\overline{U_i} \prec g_i \prec V_i$. Define

$$h_1 = g_1,\quad h_2 = (1 - g_1) g_2,\quad \dots,\quad h_m = (1 - g_1) \cdots (1 - g_{m-1}) g_m.$$

The fact that $h_i \prec V_i$ is obvious. Simple induction shows that, for $2 \le i \le m$,
$h_1 + \dots + h_i = 1 - (1 - g_1) \cdots (1 - g_i)$. Now define $h = h_1 + \dots + h_m$. Thus
$h = 1 - (1 - g_1) \cdots (1 - g_m)$. If $x \in K$, then $x \in U_i$ for some $i$, so $g_i(x) = 1$, and $h(x) = 1$. ∎


The functions $h_1, \dots, h_m$ in the above lemma are called a partition of unity on $K$
subordinate to the open cover $\{V_i\}_{i=1}^{m}$.
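The product formula for the functions $h_i$ can be checked numerically. The sketch below works in one dimension with the assumed data $K = [0,3]$, $V_1 = (-1,2)$, $V_2 = (1,4)$, $U_1 = (-0.2,1.6)$, $U_2 = (1.4,3.2)$ and piecewise-linear functions $g_i$; these choices and the helper names are illustrative only.

```python
def bump(lo, hi, margin):
    """Piecewise-linear g: g = 1 on [lo, hi], g = 0 outside (lo - margin, hi + margin)."""
    def g(x):
        if lo <= x <= hi:
            return 1.0
        gap = lo - x if x < lo else x - hi
        return max(0.0, 1.0 - gap / margin)
    return g

def partition_of_unity(gs):
    """Return h_1 = g_1 and h_i = (1 - g_1)...(1 - g_{i-1}) g_i for i >= 2."""
    def make_h(i):
        def h(x):
            prod = 1.0
            for g in gs[:i]:
                prod *= 1.0 - g(x)
            return prod * gs[i](x)
        return h
    return [make_h(i) for i in range(len(gs))]

# supp(g_1) = [-0.5, 1.9] lies in V_1 and supp(g_2) = [1.1, 3.5] lies in V_2.
gs = [bump(-0.2, 1.6, 0.3), bump(1.4, 3.2, 0.3)]
hs = partition_of_unity(gs)

def total(x):
    """h_1(x) + ... + h_m(x)."""
    return sum(h(x) for h in hs)
```

On $K$ at least one $g_i$ equals 1, so the product $(1 - g_1)(1 - g_2)$ vanishes there and the sum of the $h_i$ is 1, which is the mechanism used in the proof.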


Dicing R” 


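For a fixed natural number $k$, consider the following partition of $\mathbb{R}$:

$$\left\{ \left[ \frac{\nu}{2^k}, \frac{\nu + 1}{2^k} \right) : \nu \in \mathbb{Z} \right\}.$$

This partitions each interval $[m, m+1)$ $(m \in \mathbb{Z})$ into $2^k$ congruent half-open
intervals, each of length $2^{-k}$.
The above partition of $\mathbb{R}$ can be employed to partition $\mathbb{R}^n$ into a collection of
half-open cubes:

$$\mathcal{S}_k = \left\{ \sigma = \left[ \frac{\nu_1}{2^k}, \frac{\nu_1 + 1}{2^k} \right) \times \dots \times \left[ \frac{\nu_n}{2^k}, \frac{\nu_n + 1}{2^k} \right) : (\nu_1, \dots, \nu_n) \in \mathbb{Z}^n \right\}.$$

Note that, for $\sigma \in \mathcal{S}_k$, $\mathrm{diam}(\sigma) = \sqrt{n}\,2^{-k}$ and that if $\sigma$ and $\sigma'$ are distinct cubes in
$\mathcal{S}_k$, then $\sigma \cap \sigma' = \emptyset$.


Observe that the half-open unit cube $[0,1) \times \dots \times [0,1)$ is the union of $2^{nk}$ cubes
in $\mathcal{S}_k$, that $\mathcal{S}_{k+1}$ is a refinement of $\mathcal{S}_k$, and that each cube in $\mathcal{S}_k$ is the union of $2^n$
cubes in $\mathcal{S}_{k+1}$.


Now, given an open set $V \subseteq \mathbb{R}^n$, let

$$\mathcal{S}_k(V) = \{\sigma \in \mathcal{S}_k : \overline{\sigma} \subseteq V\}, \quad\text{and}\quad G_k = \cup\{\sigma : \sigma \in \mathcal{S}_k(V)\}.$$

Note that $G_1 \subseteq G_2 \subseteq \dots$. We claim that $V = \cup_{k=1}^{\infty} G_k$. Clearly, $V \supseteq \cup_{k=1}^{\infty} G_k$.
Conversely, if $x \in V$, there exists $\delta > 0$ such that $B(x, 2\delta) \subseteq V$. Choose $k \in \mathbb{N}$ such that
$\sqrt{n}\,2^{-k} < \delta$. (Reminder: $\sqrt{n}\,2^{-k}$ is the diameter of a cube in $\mathcal{S}_k$.) Since
$\mathbb{R}^n = \cup\{\sigma : \sigma \in \mathcal{S}_k\}$, $x \in \sigma$ for some $\sigma \in \mathcal{S}_k$. Since $\mathrm{diam}(\sigma) < \delta$,
$\overline{\sigma} \subseteq B(x, 2\delta) \subseteq V$. This proves that $x \in G_k$ and that $V = \cup_{k=1}^{\infty} G_k$.

Figure 8.2 (a) $\mathcal{S}_3(U)$; (b) $\mathcal{S}_4(U)$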


This construction should be geometrically obvious. The set $\mathcal{S}_k(V)$ is the largest set
of cubes in $\mathcal{S}_k$ that fits inside $V$. It is also clear that $\mathcal{S}_{k+1}(V)$ is a refinement of $\mathcal{S}_k(V)$
that also contains all the additional cubes in $\mathcal{S}_{k+1}$ that fit in $V$. Figure 8.2 illustrates
the construction: figure 8.2(a) depicts all the squares of side length 1/8 that fit in the
unit disk $U$, and figure 8.2(b) shows all the squares of side length 1/16 that fit in the
disk. The unions of the squares are $G_3(U)$ and $G_4(U)$, respectively.
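The dicing of the unit disk can be reproduced with a short computation. The sketch below counts the squares of $\mathcal{S}_k$ that fit in the open unit disk $U$ and reports the total area of their union $G_k(U)$. It assumes that a square "fits" when its closure lies in $U$, which is tested at the corner farthest from the origin; the function names are hypothetical.

```python
def count_squares_in_disk(k):
    """Number of squares of side 2**-k, with corners on the dyadic grid,
    whose closure lies inside the open unit disk."""
    side = 2.0 ** -k
    n = 2 ** k
    count = 0
    for i in range(-n, n):
        for j in range(-n, n):
            # Farthest corner of the closed square [i*side, (i+1)*side] x [j*side, (j+1)*side].
            fx = max(abs(i), abs(i + 1)) * side
            fy = max(abs(j), abs(j + 1)) * side
            if fx * fx + fy * fy < 1.0:
                count += 1
    return count

def area_G(k):
    """Total area of G_k(U): each square in S_k has area 4**-k."""
    return count_squares_in_disk(k) * 4.0 ** -k

areas = [area_G(k) for k in range(1, 7)]
```

The areas increase with $k$ and stay below $\pi$, in line with the claim that the sets $G_k$ form an ascending sequence whose union is the whole open set.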


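Lemma 8.4.5. Let $V$ be an open subset of $\mathbb{R}^n$. Then $V$ is the countable union of
disjoint cubes $\sigma$ of the type discussed in the previous paragraph. More specifically,
$V = \cup_{i=1}^{\infty} \sigma_i$, where $\sigma_i \in \cup_{k=1}^{\infty} \mathcal{S}_k$ and $\sigma_i \cap \sigma_j = \emptyset$ if $i \ne j$.


Proof. Let $B_1 = G_1$, and, for $k \ge 1$, let $B_{k+1} = G_{k+1} - G_k$. The family $\{B_k\}$ is mutually
disjoint, and $\cup_{k=1}^{\infty} B_k = V$. Each $B_k$ is the union of cubes in $\mathcal{S}_k$. The collection of
all such cubes is countable, and their union (over $k \in \mathbb{N}$) is $V$. Renumbering those
cubes as $\sigma_1, \sigma_2, \dots$, we obtain $V = \cup_{i=1}^{\infty} \sigma_i$. Finally, consider two distinct cubes, $\sigma_i$
and $\sigma_j$. If $\sigma_i \subseteq B_r$ and $\sigma_j \subseteq B_s$, where $r \ne s$, then $\sigma_i \cap \sigma_j = \emptyset$ because $B_r \cap B_s = \emptyset$
if $r \ne s$. If $\sigma_i$ and $\sigma_j$ are subsets of $B_r$ for some integer $r$, then $\sigma_i \cap \sigma_j = \emptyset$ because
the cubes in $\mathcal{S}_r$ are disjoint. ∎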


As an illustration of the above construction, figure 8.3 depicts the set

$$B_4(U) = G_4(U) - G_3(U)$$

for the unit disk $U$.

Figure 8.3 $B_4(U)$ shown as the union of the unshaded squares


Lebesgue Measure: Motivation and Overview


This subsection is included for the sole purpose of building the reader’s intuition. 
It is not meant to be a rigorous development of any particular set of ideas. 


It must be emphasized at the outset that the Lebesgue measure is not an artificial 
construct but rather a very natural kind of measure, as the reader will see below. 
The broad goals are intuitively clear; we wish to find a large enough $\sigma$-algebra
$\mathcal{L}^n$ in $\mathbb{R}^n$ and a positive measure $\lambda$ on $\mathcal{L}^n$ that extends and is consistent with our
common geometric perceptions about length, area, and volume. It is therefore
entirely reasonable to expect (indeed, require) that every closed box $Q$ must
be in $\mathcal{L}^n$ and that the Lebesgue measure of such a box must be the product of its
dimensions, consistent with our definition of the volume of a closed box in section
8.1. Surprisingly, those two simple requirements allow us to achieve most of our
broad goals. Because every open subset of $\mathbb{R}^n$ is a countable union of closed boxes,
every open subset of $\mathbb{R}^n$ must be in $\mathcal{L}^n$; hence $\mathcal{L}^n$ contains all Borel subsets of $\mathbb{R}^n$.
The requirement that $\lambda(Q) = \mathrm{vol}(Q)$ uniquely extends the Lebesgue measure to
all open sets, as we explain below.

In lemma 8.4.5, if we define $K_i = \cup_{j=1}^{i} \overline{\sigma_j}$, then $\lambda(K_i) = \sum_{j=1}^{i} \mathrm{vol}(\sigma_j)$, and it follows
directly from theorem 8.2.3 that

$$\lambda(V) = \lim_i \lambda(K_i). \qquad (1)$$

This discussion strongly suggests equation (1) as a possible definition of the
Lebesgue measure of an open set.⁶ However, we will take a different path.


Another way to view equation (1) is as follows: Since, for any compact subset $K$ of
$V$, $\lambda(K) \le \lambda(V)$ and since there is a sequence $K_i$ of compact subsets of $V$ such that
$\lambda(V) = \lim_i \lambda(K_i)$, it must be true that, for an open set $V \subseteq \mathbb{R}^n$,

$$\lambda(V) = \sup\{\lambda(K) : K \text{ compact}, K \subseteq V\}. \qquad (2)$$

We will use a variant of equation (2) as the definition of the Lebesgue outer
measure of an open subset $V$ of $\mathbb{R}^n$. However, this raises a serious question: why
would we abandon equation (1), which defines $\lambda(V)$ in terms of the measure
of a sequence of simple compact subsets of $V$, in favor of equation (2), which
involves the measure of general compact sets? In other words, how do we define
the measure of an arbitrary compact subset $K$ of $\mathbb{R}^n$? The answer is, we do not!
We use the Riemann integral as an instrument for the approximation of $\lambda(K)$ for a
compact subset $K$ of $V$, and this is why Urysohn's lemma is crucially important for
our development of the Lebesgue measure. Figures 8.4 and 8.5 illustrate the idea.
In figure 8.4, the outer disk depicts the open set $V$, and the inner disk depicts a
compact subset $K$ of $V$. If $f$ is a continuous function such that $K \prec f \prec V$, then the
Riemann integral $\int_{\mathbb{R}^n} f(x)\,dx$ can be regarded as an approximation of both $\lambda(K)$
and $\lambda(V)$. Figure 8.5 further illustrates the point. In that figure, the measure of $K$ is
the volume of the cylinder above $K$, which differs from $\int_{\mathbb{R}^n} f(x)\,dx$ by the volume
of the thin shell between the cylinder and the wall of the graph of $f$. Since we can
construct a compact subset $K$ that fills as much of $V$ as we wish (the compact sets $K_i$
in equation (1)), $\int_{\mathbb{R}^n} f(x)\,dx$ can be used to simultaneously approximate $\lambda(K)$ and
$\lambda(V)$ with arbitrary precision. We hope that the preceding discussion motivates
the definition below of the outer measure of an open subset of $\mathbb{R}^n$.


⁶ Equation (1) can be written more explicitly as $\lambda(V) = \lim_k \mathrm{card}(\mathcal{S}_k(V))/2^{nk}$. This is a perfectly viable
approach, and some recent books have adopted this as the definition of the measure of an open subset
of $\mathbb{R}^n$. Observe that this definition accepts as an axiom the fact that the measure of the half-open cube is
the product of its dimensions; hence the quantity $2^{-nk}$. Another implied assumption is that all the cubes
in $\mathcal{S}_k(V)$ have the same measure. This is the seed of the translation invariance of the Lebesgue measure.
See the proof of theorem 8.4.14.

Figure 8.4 A compact set $K$ filling most of $V$

Figure 8.5 A function $f$ such that $K \prec f \prec V$

In figure 8.4 we depict $V$ as a bounded open set. However, the discussion points
are valid even for open sets of infinite measure. Specifically, this means that if $V$
is an open set of infinite measure, then $V$ contains compact subsets of arbitrarily
large measure.


Lebesgue Measure 


As explained in the above motivation, the Riemann integral will play a pivotal role
in our development of the Lebesgue measure. For a function $f \in \mathcal{C}_c(\mathbb{R}^n)$, let $Q$ be
a closed box that contains $\mathrm{supp}(f)$, and define

$$\int_{\mathbb{R}^n} f(x)\,dx = \int_Q f(x)\,dx.$$

The Riemann integral is clearly a positive linear functional on $\mathcal{C}_c(\mathbb{R}^n)$. For the
remainder of this section, we will use the following notation: for a function
$f \in \mathcal{C}_c(\mathbb{R}^n)$, write

$$I(f) = \int_{\mathbb{R}^n} f(x)\,dx.$$

Definition. The Lebesgue outer measure is the set function

$$m^* : \mathcal{P}(\mathbb{R}^n) \to [0, \infty],$$

which is defined as follows: for an open set $V \subseteq \mathbb{R}^n$,

$$m^*(V) = \sup\{I(f) : f \prec V\},$$

and for an arbitrary set $A \subseteq \mathbb{R}^n$,

$$m^*(A) = \inf\{m^*(V) : A \subseteq V, V \text{ open}\}.$$

The definition of $m^*(A)$ requires some justification. It is a well-known fact that
an open subset $V$ of $\mathbb{R}$ is the disjoint union of a countable collection $\{(a_i, b_i)\}_{i=1}^{\infty}$ of
open intervals. Therefore $m^*(V) = \sum_{i=1}^{\infty} (b_i - a_i)$. It follows, trivially, that
$m^*(V) = \inf\{\sum_{i=1}^{\infty} (b_i - a_i) : V \subseteq \cup_{i=1}^{\infty} (a_i, b_i)\}$. While an arbitrary subset $A$ of $\mathbb{R}^n$ is not the
countable union of open boxes, it can be covered by such a set of boxes. Therefore
it makes sense to define $m^*(A) = \inf\{\sum_{i=1}^{\infty} \mathrm{vol}(Q_i)\}$, where the infimum is taken
over all the countable covers $\{Q_i\}_{i=1}^{\infty}$ of $A$ by open boxes. Since the union of open
boxes is an open set, and since every open set is the countable union of open boxes,
the definition of $m^*(A)$ is well justified.

Proof. The monotonicity of $m^*$ is obvious. First we show that, for open sets $V_1, \dots, V_m$,
$m^*(\cup_{i=1}^{m} V_i) \le \sum_{i=1}^{m} m^*(V_i)$. Let $f \prec \cup_{i=1}^{m} V_i$. By lemma 8.4.4, there exist
functions $h_i \prec V_i$, $1 \le i \le m$, such that $h_1(x) + \dots + h_m(x) = 1$ for every
$x \in \mathrm{supp}(f)$. Since $h_i f \prec V_i$, we have $I(f) = I(\sum_{i=1}^{m} h_i f) = \sum_{i=1}^{m} I(h_i f) \le \sum_{i=1}^{m} m^*(V_i)$.
This shows that $m^*(\cup_{i=1}^{m} V_i) \le \sum_{i=1}^{m} m^*(V_i)$.

We now show that $m^*$ is countably subadditive. Let $(E_i)$ be a sequence of subsets
of $\mathbb{R}^n$. We must prove that $m^*(\cup_{i=1}^{\infty} E_i) \le \sum_{i=1}^{\infty} m^*(E_i)$. If $\sum_{i=1}^{\infty} m^*(E_i) = \infty$, there
is nothing to prove, so assume that $\sum_{i=1}^{\infty} m^*(E_i) < \infty$. Let $\epsilon > 0$, and choose
open sets $V_i$ such that $E_i \subseteq V_i$ and $m^*(V_i) < m^*(E_i) + \epsilon/2^i$. Let $V = \cup_{i=1}^{\infty} V_i$
and let $f \prec V$, that is, $K = \mathrm{supp}(f) \subseteq \cup_{i=1}^{\infty} V_i$. The compactness of $K$ produces a
finite subcover $V_1, \dots, V_m$ of $K$. Now

$$I(f) \le m^*(V_1 \cup \dots \cup V_m) \le \sum_{i=1}^{m} m^*(V_i) \le \sum_{i=1}^{\infty} m^*(V_i) \le \sum_{i=1}^{\infty} [m^*(E_i) + \epsilon/2^i] = \sum_{i=1}^{\infty} m^*(E_i) + \epsilon.$$

Since the last inequality is true for an arbitrary $f \prec V$, $m^*(V) \le \sum_{i=1}^{\infty} m^*(E_i) + \epsilon$. Since $\cup_{i=1}^{\infty} E_i \subseteq V$ and
$m^*$ is monotone, $m^*(\cup_{i=1}^{\infty} E_i) \le m^*(V) \le \sum_{i=1}^{\infty} m^*(E_i) + \epsilon$. Because $\epsilon$ is arbitrary,
$m^*(\cup_{i=1}^{\infty} E_i) \le \sum_{i=1}^{\infty} m^*(E_i)$. ∎


Example 1. The outer measure of an open interval $V = (a,b)$ is $b - a$.
For any function $f$ such that $f \prec V$, it is clear that $\int_a^b f(x)\,dx \le b - a$. Thus
$m^*(V) \le b - a$. Let $g$ be the continuous, piecewise linear function whose graph
contains the points $(a, 0)$, $(a + \epsilon, 0)$, $(a + 2\epsilon, 1)$, $(b - 2\epsilon, 1)$, $(b - \epsilon, 0)$, and $(b, 0)$.
The function $g$ is supported in $V$, and $\int_a^b g(x)\,dx = b - a - 3\epsilon$. Since $\epsilon$ is arbitrary,
$m^*(V) = b - a$. ◆
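The value $b - a - 3\epsilon$ in example 1 can be confirmed numerically. The sketch below builds the piecewise linear function $g$ through the six listed points and integrates it with the midpoint rule; the function names, the sample values $a = 0$, $b = 1$, $\epsilon = 0.1$, and the grid size are assumptions made for illustration.

```python
def g(x, a, b, eps):
    """Piecewise linear function through (a,0), (a+eps,0), (a+2eps,1),
    (b-2eps,1), (b-eps,0), (b,0)."""
    if x <= a + eps or x >= b - eps:
        return 0.0
    if x < a + 2 * eps:
        return (x - (a + eps)) / eps      # rising ramp of width eps
    if x > b - 2 * eps:
        return ((b - eps) - x) / eps      # falling ramp of width eps
    return 1.0                            # plateau of length b - a - 4*eps

def integral_midpoint(a, b, eps, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h, a, b, eps) for i in range(n)) * h

approx = integral_midpoint(0.0, 1.0, 0.1)
expected = 1.0 - 0.0 - 3 * 0.1   # plateau (b - a - 4*eps) plus two ramps of area eps/2
```

The two ramps contribute $\epsilon/2$ each and the plateau contributes $b - a - 4\epsilon$, which accounts for the total $b - a - 3\epsilon$.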


Example 2. The outer measure of a point set $\{x\}$ in $\mathbb{R}$ is zero.
For any $\epsilon > 0$, the interval $V_\epsilon = (x - \epsilon, x + \epsilon)$ contains $\{x\}$, and $m^*(V_\epsilon) = 2\epsilon$.
Since $\epsilon$ is arbitrary, $m^*(\{x\}) = 0$. ◆


Example 3. The following facts follow directly from example 2. The outer measure
of a countable subset of $\mathbb{R}$ is zero. The outer measure of the closed interval $[a, b]$
is $b - a$. ◆


Example 4. The outer measure of the Cantor set is zero.
In the notation of section 4.2, the Cantor set, $C$, is contained in the set $C_n$ for each
$n \in \mathbb{N}$. By the previous example and the subadditivity of $m^*$, $m^*(C_n) \le 2^n/3^n$.
Since $m^*(C) \le m^*(C_n)$ and since $n$ is arbitrary, $m^*(C) = 0$. ◆
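The bound $2^n/3^n$ can be read off from the middle-thirds construction. The sketch below assumes that $C_n$ denotes the $n$th stage of that construction, that is, the union of $2^n$ closed intervals of length $3^{-n}$; section 4.2 is not reproduced here, so this reading and the helper names are assumptions.

```python
from fractions import Fraction

def cantor_stage(n):
    """Closed intervals making up the n-th stage of the middle-thirds construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            refined.append((lo, lo + third))   # keep the left third
            refined.append((hi - third, hi))   # keep the right third
        intervals = refined
    return intervals

def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum(hi - lo for lo, hi in intervals)

stage5 = cantor_stage(5)
length5 = total_length(stage5)
```

Exact rational arithmetic keeps the total length equal to $(2/3)^n$ at every stage, and this quantity tends to zero.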


Example 5. The two-dimensional outer measure of the set $E = \{(x, 0) : x \in \mathbb{R}\}$
(the x-axis) is zero.

In a manner quite similar to that in example 1, one can show that the
two-dimensional outer measure of an open rectangle is the product of its
dimensions. Let $R_n$ be the open rectangle $(-n, n) \times (-\epsilon/n^3, \epsilon/n^3)$. Now
$m^*(R_n) = 4\epsilon/n^2$, and $E \subseteq \cup_{n=1}^{\infty} R_n$. By the subadditivity of $m^*$,
$m^*(E) \le m^*(\cup_{n=1}^{\infty} R_n) \le \sum_{n=1}^{\infty} 4\epsilon/n^2 = 2\pi^2\epsilon/3$. Since $\epsilon$ is arbitrary, $m^*(E) = 0$. ◆


Definition. A subset $E$ of $\mathbb{R}^n$ is Lebesgue measurable if it satisfies the
Carathéodory condition: for every $A \subseteq \mathbb{R}^n$,

$$m^*(A) = m^*(A \cap E) + m^*(A \cap E').$$

By theorem 8.2.7, the set $\mathcal{L}(\mathbb{R}^n)$ of Lebesgue measurable sets is a $\sigma$-algebra,
and the restriction of $m^*$ to $\mathcal{L}(\mathbb{R}^n)$ is a complete positive measure: the Lebesgue
measure on $\mathcal{L}(\mathbb{R}^n)$. We will reserve the notation $\lambda(E)$ exclusively to denote the
Lebesgue measure of a set $E \in \mathcal{L}(\mathbb{R}^n)$. We continue to write $m^*(E)$ for the outer
measure of a set $E$ whose Lebesgue measurability has not been established. We
will frequently write $\mathcal{L}^n$ as an abbreviation of $\mathcal{L}(\mathbb{R}^n)$.


The immediate task now is to show that every open subset of R” is Lebesgue 
measurable (theorem 8.4.9). We first need to establish the finite additivity of m* 
for compact and open sets. 


Theorem 8.4.7.
(a) If $K$ is compact, then

$$m^*(K) = \inf\{I(f) : K \prec f\}.$$

In particular, compact subsets of $\mathbb{R}^n$ have finite Lebesgue outer measures.
(b) If $K_1$ and $K_2$ are disjoint compact subsets of $\mathbb{R}^n$, then

$$m^*(K_1 \cup K_2) = m^*(K_1) + m^*(K_2).$$


Proof. Let $K \prec f$. If $0 < \alpha < 1$, then the set $V_\alpha = \{x \in \mathbb{R}^n : f(x) > \alpha\}$ is open and
contains $K$. Now if $g \prec V_\alpha$, then $\alpha g \le f$, and
$m^*(K) \le m^*(V_\alpha) = \sup\{I(g) : g \prec V_\alpha\} \le \alpha^{-1} I(f)$. Letting $\alpha \to 1$, we obtain $m^*(K) \le I(f)$. Let $\epsilon > 0$. There exists
an open set $V$ containing $K$ such that $m^*(V) < m^*(K) + \epsilon$. Choose a function
$f \in \mathcal{C}_c^r(\mathbb{R}^n)$ such that $K \prec f \prec V$. Then $I(f) \le m^*(V) < m^*(K) + \epsilon$. This establishes
part (a).


To prove part (b), let $\epsilon > 0$. By part (a), there exists a function $g \in \mathcal{C}_c^r(\mathbb{R}^n)$
such that $K_1 \cup K_2 \prec g$, and $I(g) < m^*(K_1 \cup K_2) + \epsilon$. By lemma 8.4.2, there exists
an open subset $W$ with compact closure such that $K_1 \subseteq W \subseteq \overline{W} \subseteq \mathbb{R}^n - K_2$.
By lemma 8.4.3, there exists a function $f \in \mathcal{C}_c^r(\mathbb{R}^n)$ such that $K_1 \prec f \prec W$. In
particular, $f(K_1) = 1$, and $f(K_2) = 0$. Note that $K_1 \prec fg$ and that $K_2 \prec (1 - f)g$.
Now $m^*(K_1) + m^*(K_2) \le I(fg) + I(g - fg) = I(g) < m^*(K_1 \cup K_2) + \epsilon$. Since $\epsilon$ is
arbitrary, $m^*(K_1) + m^*(K_2) \le m^*(K_1 \cup K_2)$. Now the subadditivity of $m^*$ delivers
the result. ∎


Theorem 8.4.8. For an open set $V$,


(a) $m^*(V) = \sup\{m^*(K) : K \text{ compact}, K \subseteq V\}$,
(b) $m^*(V) = \sup\{m^*(U) : U \text{ open}, \overline{U} \text{ compact}, \overline{U} \subseteq V\}$, and
(c) if $V_1$, $V_2$ are disjoint open sets, then $m^*(V_1 \cup V_2) = m^*(V_1) + m^*(V_2)$.


Proof. Let $\alpha < m^*(V)$. By the definition of $m^*(V)$, there exists a function $f \in \mathcal{C}_c^r(\mathbb{R}^n)$
such that $f \prec V$ and $I(f) > \alpha$. Let $K = \mathrm{supp}(f)$. If $K \subseteq W$ for some open set $W$, then
$f \prec W$, so $I(f) \le m^*(W)$. This shows that

$$m^*(K) = \inf\{m^*(W) : W \text{ open}, K \subseteq W\} \ge I(f) > \alpha.$$

Thus $\alpha < m^*(K) \le m^*(V)$. This proves part (a). Observe that this proof is valid
even when $m^*(V) = \infty$.


Part (b) follows from part (a) and lemma 8.4.2.


To prove (c), we may, without loss of generality, assume that $V_1$ and $V_2$ have
finite outer measure. Let $\epsilon > 0$. By part (a), there exist compact sets $K_1 \subseteq V_1$ and $K_2 \subseteq V_2$
such that $m^*(V_i) < m^*(K_i) + \epsilon/2$, $i = 1, 2$. The set $K = K_1 \cup K_2$ is compact, and
$K_1 \cup K_2 \subseteq V_1 \cup V_2$. Now, by theorem 8.4.7(b),
$m^*(V_1) + m^*(V_2) < m^*(K_1) + m^*(K_2) + \epsilon = m^*(K_1 \cup K_2) + \epsilon \le m^*(V_1 \cup V_2) + \epsilon$.
Since $\epsilon$ is arbitrary, and $m^*$ is subadditive,
$m^*(V_1) + m^*(V_2) = m^*(V_1 \cup V_2)$. ∎


Theorem 8.4.9. Every open subset of $\mathbb{R}^n$ is Lebesgue measurable. Consequently,
every Borel subset of $\mathbb{R}^n$ is in $\mathcal{L}^n$.


Proof. Let $U$ be an open subset of $\mathbb{R}^n$, and let $A$ be an arbitrary subset of $\mathbb{R}^n$. Since
$m^*(A) \le m^*(A \cap U) + m^*(A \cap U')$, we may assume that $m^*(A) < \infty$. Without
loss of generality, assume that $A \cap U \ne \emptyset \ne A \cap U'$. Let $\epsilon > 0$. There exists an
open set $V$ containing $A$ such that $m^*(V) < m^*(A) + \epsilon/2$. By theorem 8.4.8, there
exists an open set $W$ such that $\overline{W}$ is compact, $\overline{W} \subseteq V \cap U$, and
$m^*(W) + \epsilon/2 > m^*(V \cap U)$.

Let $W_0 = V \cap (\overline{W})'$. Notice that $W_0 \cap W = \emptyset$, that $W_0 \cup W \subseteq V$, and that $W_0$
has finite outer measure. Now $V \cap U' \subseteq W_0$; so $m^*(W_0) \ge m^*(V \cap U')$. Using this
information and theorem 8.4.8(c),

$$m^*(A) + \epsilon > m^*(V) + \epsilon/2 \ge m^*(W \cup W_0) + \epsilon/2 = m^*(W) + m^*(W_0) + \epsilon/2$$
$$> m^*(V \cap U) + m^*(V \cap U') \ge m^*(A \cap U) + m^*(A \cap U').$$

Since $\epsilon$ is arbitrary, $m^*(A) \ge m^*(A \cap U) + m^*(A \cap U')$. ∎

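Theorem 8.4.10. Let $E \in \mathcal{L}^n$.


(a) For $\epsilon > 0$, there exists a closed subset $F$ and an open subset $V$ such that
$F \subseteq E \subseteq V$ and $\lambda(V - F) < \epsilon$.

(b) $\lambda(E) = \sup\{\lambda(K) : K \text{ compact}, K \subseteq E\}$.

(c) There exists an $F_\sigma$ set $A$ and a $G_\delta$ set $B$ such that $A \subseteq E \subseteq B$ and $\lambda(B - A) = 0$.


Proof. $\mathbb{R}^n$ is the countable union of the nest of compact balls $K_i = \overline{B}(0, i)$, $i \in \mathbb{N}$.
For each $i \in \mathbb{N}$, $\lambda(K_i \cap E) < \infty$. Thus there exist open sets $V_i \supseteq K_i \cap E$
such that $\lambda(V_i - (K_i \cap E)) < \epsilon/2^{i+1}$. Let $V = \cup_{i=1}^{\infty} V_i$. Then $E \subseteq V$,
$V - E \subseteq \cup_{i=1}^{\infty} (V_i - (K_i \cap E))$, and $\lambda(V - E) < \epsilon/2$. Applying this result to $E'$, we can
find an open set $W$ containing $E'$ such that $\lambda(W - E') < \epsilon/2$. Let $F = W'$.
Then $F \subseteq E$, $E - F = W - E'$, and $\lambda(E - F) = \lambda(W - E') < \epsilon/2$. Thus
$\lambda(V - F) = \lambda(V - E) + \lambda(E - F) < \epsilon/2 + \epsilon/2 = \epsilon$. This proves part (a).


If $F$ is closed, $F = \cup_i (K_i \cap F)$. Each $K_i \cap F$ is compact and $\lim_i \lambda(K_i \cap F) = \lambda(F)$.
Thus (b) holds for closed subsets of $\mathbb{R}^n$. If $E \in \mathcal{L}^n$ and $\epsilon > 0$, by part (a) we
can choose a closed set $F \subseteq E$ such that $\lambda(E - F) < \epsilon/2$. If $\lambda(F) = \infty$, then
$\sup\{\lambda(K) : K \text{ compact}, K \subseteq E\} \ge \sup\{\lambda(K) : K \text{ compact}, K \subseteq F\} = \infty$. If $\lambda(F) < \infty$,
there exists a compact subset $K$ of $F$ such that $\lambda(F - K) < \epsilon/2$. Now
$\lambda(E) = \lambda(K) + \lambda(E - F) + \lambda(F - K) < \lambda(K) + \epsilon$.


To prove part (c), find open sets $V_i$ and closed sets $F_i$ such that $F_i \subseteq E \subseteq V_i$
and $\lambda(V_i - F_i) < 1/i$. Set $A = \cup_{i=1}^{\infty} F_i$ and $B = \cap_{i=1}^{\infty} V_i$. Then $\lambda(B - A) < 1/i$ for
every $i \in \mathbb{N}$; hence $\lambda(B - A) = 0$. Observe that these results are valid even when
$\lambda(E) = \infty$. ∎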


Observation. Let $\mathcal{B}^n$ be the $\sigma$-algebra of Borel subsets of $\mathbb{R}^n$, and consider the
measure space $(\mathbb{R}^n, \mathcal{B}^n, \lambda)$. It is known that the restriction of $\lambda$ to $\mathcal{B}^n$ is not a
complete measure (see problem 4 at the end of this section). Theorem 8.4.10(c)
implies that $(\mathbb{R}^n, \mathcal{L}^n, \lambda)$ is the completion of $(\mathbb{R}^n, \mathcal{B}^n, \lambda)$. See problems 3 and 4 in
section 8.2. The following corollary is an affirmation of the same fact.

Corollary 8.4.11. $\mathcal{L}^n$ is the smallest $\sigma$-algebra that contains $\mathcal{B}^n$ and all sets of
Lebesgue (outer) measure 0.


Proof. We have already seen that $\mathcal{B}^n \subseteq \mathcal{L}^n$ and that all subsets of Lebesgue outer
measure 0 are Lebesgue measurable. We show that if $\mathfrak{M}$ is a $\sigma$-algebra containing
$\mathcal{B}^n$ and all subsets of Lebesgue measure 0, then $\mathcal{L}^n \subseteq \mathfrak{M}$. If $E \in \mathcal{L}^n$, then, by
theorem 8.4.10, there exists an $F_\sigma$ set $A$ such that $A \subseteq E$ and $\lambda(E - A) = 0$. Thus
$E = A \cup (E - A)$, where $A \in \mathcal{B}^n$ and $\lambda(E - A) = 0$. Hence $E \in \mathfrak{M}$. ∎


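Recall from section 8.1 that the volume of a closed box
$Q = \{x \in \mathbb{R}^n : a_i \le x_i \le b_i\}$
is, by definition, $\mathrm{vol}(Q) = \prod_{i=1}^{n} (b_i - a_i)$.


Lemma 8.4.12. For an open box $Q = (a_1, b_1) \times \dots \times (a_n, b_n)$,

$$\lambda(Q) = \mathrm{vol}(Q) = \prod_{i=1}^{n} (b_i - a_i).$$


Proof. Let $f \prec Q$. Then $I(f) = \int_Q f(x)\,dx \le \int_Q 1\,dx = \mathrm{vol}(Q)$. Thus $\lambda(Q) \le \mathrm{vol}(Q)$.
For small-enough positive constants $\epsilon$, define
$Q_\epsilon = [a_1 + \epsilon, b_1 - \epsilon] \times \dots \times [a_n + \epsilon, b_n - \epsilon]$. There exists a function $f \in \mathcal{C}_c^r(\mathbb{R}^n)$ such that $Q_\epsilon \prec f \prec Q$. Therefore
$\lambda(Q) \ge I(f) = \int_Q f\,dx \ge \int_{Q_\epsilon} f\,dx = \int_{Q_\epsilon} 1\,dx = \mathrm{vol}(Q_\epsilon) = \prod_{i=1}^{n} (b_i - a_i - 2\epsilon)$.
Since $\epsilon$ is arbitrary, $\lambda(Q) \ge \prod_{i=1}^{n} (b_i - a_i) = \mathrm{vol}(Q)$. ∎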


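Example 6. Consider an open box $Q = (a_1, b_1) \times \dots \times (a_n, b_n)$, and let $\overline{Q}$ be the
closed box $[a_1, b_1] \times \dots \times [a_n, b_n]$. For every $k \in \mathbb{N}$, let $Q_k$ be the open box
$(a_1 - \frac{1}{k}, b_1 + \frac{1}{k}) \times \dots \times (a_n - \frac{1}{k}, b_n + \frac{1}{k})$. Since $\{Q_k\}$ is a descending sequence and
$\overline{Q} = \cap_{k=1}^{\infty} Q_k$, $\lambda(\overline{Q}) = \lim_k \lambda(Q_k) = \lim_k \prod_{i=1}^{n} (b_i - a_i + \frac{2}{k}) = \prod_{i=1}^{n} (b_i - a_i) = \mathrm{vol}(Q) = \lambda(Q)$.
Therefore $\lambda(\overline{Q}) = \lambda(Q)$, and the boundary of any box has
Lebesgue measure zero. Therefore the Lebesgue measure of any box (open,
closed, or half-open) is the product of its dimensions. We will continue to refer
to the Lebesgue measure of a box as its volume. ◆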


Example 7. The Lebesgue measure of an open ball of radius $r$ in $\mathbb{R}^n$ is $c_n r^n$, where
$c_n$ is the measure of the unit open ball in $\mathbb{R}^n$.


Let $B_r$ be the open ball of radius $r$ centered at the origin, and let $B$ be
the open unit ball. By lemma 8.4.5, $B = \cup_{i=1}^{\infty} \sigma_i$, where $\{\sigma_i\}$ is a sequence of
disjoint half-open cubes. Because $B_r = rB = \cup_{i=1}^{\infty} r\sigma_i$,
$\lambda(B_r) = \sum_{i=1}^{\infty} \lambda(r\sigma_i) = \sum_{i=1}^{\infty} r^n \lambda(\sigma_i) = r^n \lambda(B) = c_n r^n$. ◆


We are now ready to prove that the Riemann integral of a function of compact 
support is the same as its integral with respect to Lebesgue measure. 


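Theorem 8.4.13. For a function $f \in \mathcal{C}_c(\mathbb{R}^n)$, $\int_{\mathbb{R}^n} f\,d\lambda = \int_{\mathbb{R}^n} f(x)\,dx$.


Proof. It is sufficient to prove the result for a positive function $f$. Let $Q$ be a half-
open cube containing $\mathrm{supp}(f)$ in its interior. For $k \in \mathbb{N}$, consider the partition of $Q$ into
$2^{nk}$ congruent, half-open sub-cubes $\{\sigma_1, \dots, \sigma_{2^{nk}}\}$, and let $s_k(f) = \sum_{i=1}^{2^{nk}} m_i \mathrm{vol}(\sigma_i)$
be the lower Riemann sums of $f$. Here $m_i = \min\{f(x) : x \in \overline{\sigma_i}\}$. By theorem
8.1.2, $\lim_{k \to \infty} s_k(f) = \int_{\mathbb{R}^n} f(x)\,dx$. On the other hand, the simple functions
$f_k(x) = \sum_{i=1}^{2^{nk}} m_i \chi_{\sigma_i}(x)$ satisfy $0 \le f_1 \le f_2 \le \dots$, and $\lim_k f_k(x) = f(x)$ for every $x \in \mathbb{R}^n$.
By the monotone convergence theorem, $\lim_k \int_{\mathbb{R}^n} f_k\,d\lambda = \int_{\mathbb{R}^n} f\,d\lambda$. But
$\int_{\mathbb{R}^n} f_k\,d\lambda = \sum_{i=1}^{2^{nk}} m_i \lambda(\sigma_i) = \sum_{i=1}^{2^{nk}} m_i \mathrm{vol}(\sigma_i) = s_k(f)$. Therefore, $\lim_k s_k(f) = \int_{\mathbb{R}^n} f\,d\lambda$. ∎


The previous theorem is commonly cast in the following language: the Lebesgue
integral extends the Riemann integral from $\mathcal{C}_c(\mathbb{R}^n)$ to $\mathfrak{L}^1(\mathbb{R}^n, \mathcal{L}^n, \lambda)$. We also
say that the Lebesgue measure $\lambda$ represents the positive linear functional
$I(f) = \int_{\mathbb{R}^n} f(x)\,dx$.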
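The lower Riemann sums in the proof of theorem 8.4.13 can be computed explicitly in one dimension. The sketch below uses an assumed test function, the tent $f(x) = \max(0, 1 - |x|)$ with support $[-1, 1]$, and the half-open interval $Q = [-2, 2)$; neither comes from the text. Because the tent has a single peak, its minimum over a closed subinterval is attained at an endpoint.

```python
def tent(x):
    """Continuous function with compact support [-1, 1] and integral 1."""
    return max(0.0, 1.0 - abs(x))

def lower_sum(k, lo=-2.0, hi=2.0):
    """Lower Riemann sum of the tent over [lo, hi) split into 2**k equal pieces."""
    pieces = 2 ** k
    width = (hi - lo) / pieces
    total = 0.0
    for i in range(pieces):
        u = lo + i * width
        v = u + width
        # Minimum over the closed piece [u, v]; valid because tent has one peak.
        total += min(tent(u), tent(v)) * width
    return total

sums = [lower_sum(k) for k in range(2, 11)]
```

The sums increase with $k$ and approach 1, the Riemann integral of the tent, which mirrors the monotone convergence of the simple functions $f_k$ in the proof.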


Theorem 8.4.14. The Lebesgue measure is translation invariant. Thus if $E \in \mathcal{L}^n$,
and $x \in \mathbb{R}^n$, then $\lambda(E + x) = \lambda(E)$.


Proof. It is easy to see that $\mathrm{vol}(Q + x) = \mathrm{vol}(Q)$ for every box $Q$. Now let $V$ be an
open subset of $\mathbb{R}^n$. By lemma 8.4.5, we can write $V$ as a disjoint union of half-
open cubes, $V = \cup_{i=1}^{\infty} \sigma_i$. Thus $V + x = \cup_{i=1}^{\infty} (\sigma_i + x)$, and
$\lambda(V + x) = \lambda(\cup_{i=1}^{\infty} (\sigma_i + x)) = \sum_{i=1}^{\infty} \lambda(\sigma_i + x) = \sum_{i=1}^{\infty} \lambda(\sigma_i) = \lambda(V)$. Thus the result holds for open subsets
of $\mathbb{R}^n$. The general result for an arbitrary measurable set $E$ follows from the special
case we just established and the fact that $\lambda(E) = \inf\{\lambda(V) : E \subseteq V, V \text{ open}\}$. See
the definition of the Lebesgue outer measure. ∎


We now summarize the properties of Lebesgue measure. 


Theorem 8.4.15. Lebesgue measure is a complete, translation-invariant measure on
$\mathcal{L}^n$, and


(a) every Borel subset of $\mathbb{R}^n$ is Lebesgue measurable;
(b) for every open set $V$, $\lambda(V) = \sup\{\int_{\mathbb{R}^n} f(x)\,dx : f \prec V\}$;
(c) for every $E \in \mathcal{L}^n$, $\lambda(E) = \inf\{\lambda(V) : E \subseteq V, V \text{ open}\}$;
(d) for every compact set $K$, $\lambda(K) = \inf\{\int_{\mathbb{R}^n} f(x)\,dx : K \prec f\}$;
(e) for every $E \in \mathcal{L}^n$, $\lambda(E) = \sup\{\lambda(K) : K \subseteq E, K \text{ compact}\}$, and
(f) $\lambda$ extends (represents) the Riemann integral in the sense that

$$\int_{\mathbb{R}^n} f(x)\,dx = \int_{\mathbb{R}^n} f\,d\lambda$$

for every $f \in \mathcal{C}_c(\mathbb{R}^n)$.


Property (a) is by theorem 8.4.9.

Properties (b) and (c) are the definitions of Lebesgue outer measure.

Property (d) is theorem 8.4.7(a).

Property (e) is theorem 8.4.10(b).

Property (f) is theorem 8.4.13.

Finally, the completeness of $\lambda$ is by theorem 8.2.7, and the translation invariance
of $\lambda$ is by theorem 8.4.14. ∎


Definitions. Let $X$ be a locally compact metric (topological) space, and let $\mathfrak{M}$ be
a $\sigma$-algebra of subsets of $X$ that contains all Borel subsets of $X$.
A positive measure $\mu$ on $\mathfrak{M}$ is said to be


(a) outer regular if, for every $E \in \mathfrak{M}$, $\mu(E) = \inf\{\mu(V) : E \subseteq V, V \text{ open}\}$, and
(b) inner regular if, for every $E \in \mathfrak{M}$, $\mu(E) = \sup\{\mu(K) : K \subseteq E, K \text{ compact}\}$.


We say that $\mu$ is regular if it is both inner and outer regular.


Lebesgue measure is outer regular by the very definition of the Lebesgue outer
measure, $m^*$. Theorem 8.4.10(b) states that Lebesgue measure is inner regular.


We conclude this section with two uniqueness results that characterize Lebesgue 
measure. 


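Theorem 8.4.16. Let $\mu$ be a regular measure on $\mathcal{L}^n$ such that $\mu(K) < \infty$ for
every compact subset $K$ of $\mathbb{R}^n$, and $\int_{\mathbb{R}^n} f\,d\mu = \int_{\mathbb{R}^n} f(x)\,dx$ for every $f \in \mathcal{C}_c(\mathbb{R}^n)$.
Then $\mu = \lambda$.


Proof. It is sufficient to prove that $\mu(K) = \lambda(K)$ for every compact set $K$. The result
then follows from the regularity of $\lambda$ and $\mu$.
Let $\epsilon > 0$. There exists an open subset $V$ such that $K \subseteq V$ and $\mu(V) < \mu(K) + \epsilon$.
Let $f \in \mathcal{C}_c^r(\mathbb{R}^n)$ be such that $K \prec f \prec V$. Then
$\lambda(K) = \int_{\mathbb{R}^n} \chi_K\,d\lambda \le \int_{\mathbb{R}^n} f\,d\lambda = \int_{\mathbb{R}^n} f(x)\,dx = \int_{\mathbb{R}^n} f\,d\mu \le \int_{\mathbb{R}^n} \chi_V\,d\mu = \mu(V) < \mu(K) + \epsilon$.
Since $\epsilon$ is arbitrary, $\lambda(K) \le \mu(K)$. Switching the roles of $\lambda$ and $\mu$, we obtain $\mu(K) \le \lambda(K)$. ∎

Theorem 8.4.17. Let $\mu$ be a translation invariant measure on $\mathcal{L}^n$ such that
$c = \mu([0,1)^n) > 0$. Then $\mu = c\lambda$, that is, $\mu(E) = c\lambda(E)$ for every $E \in \mathcal{L}^n$.


Proof. Let $Q$ be the half-open unit cube $[0,1)^n$. For a fixed $k \in \mathbb{N}$, partition $Q$
into $2^{nk}$ congruent half-open sub-cubes $\sigma_1, \dots, \sigma_{2^{nk}}$ in $\mathcal{S}_k$. Each cube in $\mathcal{S}_k$ is a
translation of any other cube in $\mathcal{S}_k$; therefore, by assumption, $\mu(\sigma_i) = \mu(\sigma_1)$ for
$i = 1, 2, \dots, 2^{nk}$. Since $\sigma_1, \dots, \sigma_{2^{nk}}$ are disjoint, $c = \mu(Q) = 2^{nk}\mu(\sigma_1)$. Also
$c = c \cdot 1 = c\lambda(Q) = c\sum_{i=1}^{2^{nk}} \lambda(\sigma_i) = c\,2^{nk}\lambda(\sigma_1)$. It follows that $\mu(\sigma_1) = c\lambda(\sigma_1)$; hence
$\mu(\sigma) = c\lambda(\sigma)$ for every cube $\sigma$ in $\mathcal{S}_k$. Since $k$ is arbitrary, $\mu(\sigma) = c\lambda(\sigma)$ for any cube
$\sigma \in \cup_{k=1}^{\infty} \mathcal{S}_k$. By lemma 8.4.5, an arbitrary open set $V$ is a countable union of
disjoint cubes in $\cup_{k=1}^{\infty} \mathcal{S}_k$. The countable additivity of $\lambda$ and $\mu$ produces
$\mu(V) = c\lambda(V)$ for every open subset $V$.

Now let $E \in \mathcal{L}^n$. For an open set $V \supseteq E$, $\mu(E) \le \mu(V) = c\lambda(V)$. Hence
$\mu(E) \le c \inf\{\lambda(V) : V \supseteq E, V \text{ open}\} = c\lambda(E)$. To show that $\mu(E) \ge c\lambda(E)$, we first
assume that $E$ is bounded. Choose a large enough open box $Q$ that contains $E$.
Then $\mu(Q) - \mu(E) = \mu(Q - E) \le c\lambda(Q - E) = c\lambda(Q) - c\lambda(E) = \mu(Q) - c\lambda(E)$.
Thus $\mu(E) \ge c\lambda(E)$.

If $E$ is unbounded, let $B_i$ be the open ball of radius $i$ centered at the origin.
Then $E = \cup_{i=1}^{\infty} (E \cap B_i)$, and $\mu(E) = \lim_i \mu(E \cap B_i) = c \lim_i \lambda(E \cap B_i) = c\lambda(E)$. ∎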


Excursion: Radon Measures 


A close examination of the constructions and the results of this section so far 
reveals that most of the theory we developed can be extended to locally compact 
Hausdorff spaces. Specifically, if R” is replaced with a locally compact Hausdorff 
space X and the Riemann integral is replaced with a positive linear functional I on 
C(X), then we can construct a measure ju that represents J and has most (but not 
all) of the regularity properties we derived for Lebesgue measure.’ 

The following results can be established by replicating the proofs of the corre- 
sponding results for the Lebesgue integral. Theorem 5.9.2 must be used instead of 
theorem 8.4.2, and lemma 5.11.6 instead of theorem 8.4.3. The proof we included 
for lemma 8.4.4 is valid for any locally compact Hausdorff space; hence we state the
lemma below for the sake of completeness. We urge the reader to scrutinize our 
claim that the proofs of the theorems below for Radon measures are identical to 
those provided for the Lebesgue measure. The exercise is illuminating. 

Throughout this subsection, X is a locally compact Hausdorff space, and I is a
positive linear functional on $C_c(X)$. Explicitly, for $f, g \in C_c(X)$ and $\alpha, \beta \in \mathbb{K}$,
$I(\alpha f + \beta g) = \alpha I(f) + \beta I(g)$, and if $f \ge 0$, then $I(f) \ge 0$. Observe that such a functional is
monotone in the sense that if $f \le g$, then $I(f) \le I(g)$.

We continue to use the notation $K \prec f \prec V$ to indicate that $f \in C_c(X)$, $0 \le f \le 1$,
$f(K) = 1$, and $\mathrm{supp}(f) \subseteq V$.


⁷ In the sense that $\int_X f\,d\mu = I(f)$ for all $f \in C_c(X)$.

Lemma 8.4.18. Suppose $K \subseteq X$ is compact and that $V_1, \dots, V_n$ are open subsets of X
such that $K \subseteq \bigcup_{i=1}^{n} V_i$. Then there exist functions $h_1, \dots, h_n$ such that $h_i \prec V_i$ and
$(h_1 + \dots + h_n)(x) = 1$ for all $x \in K$.


The Radon outer measure induced by the positive linear functional I is the set
function

$$m^* : \mathcal{P}(X) \to [0, \infty],$$

defined as follows: for an open set $V \subseteq X$,

$$m^*(V) = \sup\{I(f) : f \prec V\},$$

and for an arbitrary set $A \subseteq X$,

$$m^*(A) = \inf\{m^*(V) : A \subseteq V,\ V \text{ open}\}.$$

Proposition 8.4.19. The set function $m^*$ is an outer measure on X.


Definition. A subset E of X is Radon measurable, or simply measurable, if it
satisfies the Carathéodory condition: for every $A \subseteq X$,

$$m^*(A) = m^*(A \cap E) + m^*(A \cap E').$$

By theorem 8.2.7, the set $\mathfrak{M}$ of measurable sets is a $\sigma$-algebra, and the restriction
of $m^*$ to $\mathfrak{M}$ is a complete positive measure: the Radon measure on $\mathfrak{M}$
induced by I. We will reserve the notation $\mu(E)$ exclusively to denote the
$\mu$-measure of a set $E \in \mathfrak{M}$. We continue to write $m^*(E)$ for the outer measure of a
set E whose Radon measurability has not been established.


Theorem 8.4.20.
(a) If K is compact, then

$$m^*(K) = \inf\{I(f) : K \prec f\}.$$

In particular, compact subsets of X have finite outer measure.
(b) If $K_1$ and $K_2$ are disjoint compact subsets of X, then $m^*(K_1 \cup K_2) = m^*(K_1) + m^*(K_2)$. ∎


Theorem 8.4.21. For an open set V,
(a) $m^*(V) = \sup\{m^*(K) : K \text{ compact},\ K \subseteq V\}$,
(b) $m^*(V) = \sup\{m^*(U) : U \text{ open},\ \overline{U} \text{ compact},\ \overline{U} \subseteq V\}$, and
(c) if $V_1, V_2$ are disjoint open sets, then $m^*(V_1 \cup V_2) = m^*(V_1) + m^*(V_2)$.

Theorem 8.4.22. Every open subset of X is Radon measurable. Consequently, every
Borel subset of X is in $\mathfrak{M}$. ∎


We now arrive at the main distinction between the Lebesgue measure and general
Radon measures. Part (a) of theorem 8.4.21 states that $\mu$ is inner regular on
open subsets of X. Inner regularity does not extend to all Radon measurable sets,
however. But we do have the following result.


Theorem 8.4.23. If $E \in \mathfrak{M}$ and $\mu(E) < \infty$, then $\mu(E) = \sup\{\mu(K) : K \subseteq E,\ K \text{ compact}\}$.
Thus $\mu$ is inner regular on sets of finite measure.


Proof. Let $\epsilon > 0$, and choose an open set $U \supseteq E$ such that $\mu(U) < \mu(E) + \epsilon$.
Since $\mu(U - E) = \mu(U) - \mu(E) < \epsilon$, there exists an open set $V \supseteq U - E$ such
that $\mu(V) < \epsilon$. By theorem 8.4.21, U contains a compact subset H such that
$\mu(U) < \mu(H) + \epsilon$. The set $K = H - V$ is clearly compact, and $K \subseteq U - V \subseteq E$.
Now

$$\mu(E) \le \mu(U) = \mu(U) - \mu(H) + \mu(H)
= \mu(U - H) + \mu(H - V) + \mu(H \cap V) < \epsilon + \mu(K) + \mu(V) < \mu(K) + 2\epsilon.$$


We now prove the generalization of theorem 8.4.13.

Theorem 8.4.24. For a function $f \in C_c(X)$, $\int_X f\,d\mu = I(f)$.


Proof. It is enough to prove that

$$I(f) \le \int_X f\,d\mu \quad \text{for every } f \in C_c(X) \qquad (3)$$

because we then have $-I(f) = I(-f) \le \int_X -f\,d\mu = -\int_X f\,d\mu$.

It is further sufficient to establish (3) when $0 \le f \le 1$. Let $K = \mathrm{supp}(f)$, and let
n be an arbitrary positive integer. For notational convenience, set $\epsilon = 1/n$. For
$0 \le i \le n$, let $y_i = i/n$, and define $E_1 = f^{-1}([0, y_1]) \cap K$ and $E_i = f^{-1}((y_{i-1}, y_i])$ for
$2 \le i \le n$. Clearly, the sets $E_i$ are disjoint and $\bigcup_{i=1}^{n} E_i = K$. Since f is continuous,
$f^{-1}(B) \in \mathcal{B}(X)$ for every Borel subset $B \subseteq \mathbb{R}$; see problem 16 on section 8.2. In
particular, the sets $E_i$ are in $\mathfrak{M}$. Because $y_i - \epsilon = y_{i-1} \le f(x)$ for all $x \in E_i$, the
simple function $s = \sum_{i=1}^{n} (y_i - \epsilon)\chi_{E_i}$ satisfies $0 \le s \le f$. Therefore

$$\sum_{i=1}^{n} (y_i - \epsilon)\mu(E_i) \le \int_X f\,d\mu. \qquad (4)$$

For $1 \le i \le n$, choose open subsets $V_i \supseteq E_i$ such that $\mu(V_i) < \mu(E_i) + \epsilon/n$ and
$f(x) < y_i + \epsilon$ for all $x \in V_i$, and let $\{h_i\}$ be a partition of unity of K subordinate
to $\{V_i\}$. Since $h_i \prec V_i$,




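$$I(h_i) \le \mu(V_i) < \mu(E_i) + \epsilon/n. \qquad (5)$$

Since $f = \sum_{i=1}^{n} h_i f$ and since $h_i f \le (y_i + \epsilon)h_i$, we have (using inequalities (4)
and (5)),

$$I(f) = \sum_{i=1}^{n} I(h_i f) \le \sum_{i=1}^{n} (y_i + \epsilon) I(h_i) \le \sum_{i=1}^{n} (y_i + \epsilon)(\mu(E_i) + \epsilon/n)$$

$$= \sum_{i=1}^{n} (y_i - \epsilon)\mu(E_i) + 2\epsilon\mu(K) + \frac{\epsilon}{n} \sum_{i=1}^{n} (y_i + \epsilon)$$

$$\le \int_X f\,d\mu + 2\epsilon\mu(K) + \frac{\epsilon}{n} \sum_{i=1}^{n} (1 + \epsilon) = \int_X f\,d\mu + 2\epsilon\mu(K) + \epsilon(1 + \epsilon).$$

This establishes inequality (3) because n is arbitrary.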


The following theorem summarizes the properties of Radon measures. 


Theorem 8.4.25. Suppose X is a locally compact Hausdorff space, and let I be a
positive linear functional on $C_c(X)$. Then the Radon measure $\mu$ induced by I is a
complete measure on $\mathfrak{M}$, and

(a) every Borel subset of X is Radon measurable;

(b) for every open subset V, $\mu(V) = \sup\{I(f) : f \prec V\}$;

(c) $\mu$ is outer regular;

(d) $\mu$ is inner regular on open sets and sets of finite $\mu$-measure;

(e) for every compact set K, $\mu(K) = \inf\{I(f) : K \prec f\}$; and

(f) $\mu$ extends (represents) I in the sense that $\int_X f\,d\mu = I(f)$ for every $f \in C_c(X)$.


Additionally, $\mu$ is unique, subject to these properties.


To prove the uniqueness part of this theorem, mimic the proof of theorem 8.4.16. 
Observe that the proof of theorem 8.4.16 is based only on the outer regularity of 
the measures in question and their inner regularity for open sets. 


The following theorem provides a sufficient condition for the inner regularity of
Radon measures (for all $E \in \mathfrak{M}$). The proof is identical to that of theorem 8.4.10.


Theorem 8.4.26. Suppose X is a $\sigma$-compact, locally compact Hausdorff space, and
let $E \in \mathfrak{M}$.

(a) For $\epsilon > 0$, there exists a closed subset F and an open subset V such that
$F \subseteq E \subseteq V$ and $\mu(V - F) < \epsilon$.

(b) $\mu(E) = \sup\{\mu(K) : K \text{ compact},\ K \subseteq E\}$.

(c) There exists an $F_\sigma$ set A and a $G_\delta$ set B such that $A \subseteq E \subseteq B$ and
$\mu(B - A) = 0$. ∎


Exercises 


1. Let $f \in C_c(\mathbb{R}^n)$. Prove that f is uniformly continuous.

2. Prove that every countable subset of $\mathbb{R}^n$ has measure 0, and find an example
of a subset E of $\mathbb{R}^n$ such that $\lambda(E) = 0$ but $\lambda(\overline{E}) = \infty$.

3. The goal of this exercise is to show the existence of non-Lebesgue measurable
subsets of $\mathbb{R}$. Complete the following sketch of the proof. Define an
equivalence relation $\sim$ on $\mathbb{R}$ by $x \sim y$ if $x - y \in \mathbb{Q}$. Each equivalence class
of $\sim$ intersects the interval $[0, 1/2]$. Let P be a subset of $[0, 1/2]$ containing
exactly one element from each of the equivalence classes of $\sim$. Enumerate
the rational numbers in $[-1/2, 1/2]$ as $\{r_n : n \in \mathbb{N}\}$, and let $P_n = r_n + P$.
Show that the family $\{P_n : n \in \mathbb{N}\}$ is disjoint and that the union $A = \bigcup_{n=1}^{\infty} P_n$
satisfies $[0, 1/2] \subseteq A \subseteq [-1/2, 1]$. If P were measurable, then $1/2 \le \lambda(A) \le 3/2$.
But $\lambda(A) = \sum_{n=1}^{\infty} \lambda(P_n)$.

4. Show that not every Lebesgue measurable set is a Borel set. Construction:
Let C be the Cantor set, and define a function $f : [0,1] \to C$ as follows:
$f(0) = 0$, and, for $x \in (0, 1]$, write $x = \sum_{n=1}^{\infty} a_n/2^n$ and set $f(x) = \sum_{n=1}^{\infty} 2a_n/3^n$.⁸ The
function f is measurable by problem 18 on section 8.2, and is one-to-one
by theorem 4.2.18. Choose a subset P of $[0,1]$ which is not Lebesgue
measurable. The set $A = f(P)$ is the set you need. Recall that the Cantor set
has measure 0; see example 4.

5. Use the fact that the Cantor set has measure 0 to show that $\mathrm{Card}(\mathcal{L}(\mathbb{R})) = 2^{\mathfrak{c}}$.
It can be shown that $\mathrm{Card}(\mathcal{B}(\mathbb{R})) = \mathfrak{c}$. Thus there are many more
Lebesgue measurable subsets than Borel subsets of $\mathbb{R}$.

6. Let $V = \{(x_1, \dots, x_n) \in \mathbb{R}^n : x_n = 0\}$. Prove that $\lambda(V) = 0$.

7. Compute the Lebesgue measure of the set {x € (0, =) : sin(1/x) > 0.} 

8. Let E be a subset of $\mathbb{R}^n$.

(a) Prove that E is Lebesgue measurable if and only if, for every $\epsilon > 0$, there
exists an open set V containing E such that $m^*(V - E) < \epsilon$. Hint: The
necessity of the condition is by theorem 8.4.10. To prove the sufficiency,
use the identity $A \cap E' = (A \cap V') \cup (A \cap (V - E))$.


⁸ We use the series representation of x if x has a terminating binary expansion.


(b) Prove that E is Lebesgue measurable if and only if, for every $\epsilon > 0$, E
contains a closed subset F such that $m^*(E - F) < \epsilon$.

The importance of this problem and the next is that they provide more
intuitive characterizations of Lebesgue measurability than the Carathéodory
condition does. Intuitively, a subset E of $\mathbb{R}^n$ is Lebesgue measurable if it can
be approximated from the outside by an open set or from the inside by a
closed set.

9. Let E be a subset of $\mathbb{R}^n$.

(a) Prove that E is Lebesgue measurable if and only if there exists a $G_\delta$ set
G containing E such that $m^*(G - E) = 0$.

(b) Prove that E is Lebesgue measurable if and only if E contains an $F_\sigma$ set
F such that $m^*(E - F) = 0$.

10. In this exercise, we use $\lambda_k$ to denote the Lebesgue measure on $\mathbb{R}^k$. Let r and
s be positive integers, and let $n = r + s$. Prove that if $U \subseteq \mathbb{R}^r$ and $V \subseteq \mathbb{R}^s$ are
open sets, then $\lambda_n(U \times V) = \lambda_r(U)\lambda_s(V)$. Hint: $S_k(U \times V) = S_k(U) \times S_k(V)$.

11. Let $r_n$ be an enumeration of $\mathbb{Q}$, and let $G = \bigcup_{n=1}^{\infty}$ (r_n - —, r_n + —). Prove
that $\lambda(G \Delta F) > 0$ for every closed subset F of $\mathbb{R}$. Hint: Show that if
$\lambda(G - F) = 0$, then $F = \mathbb{R}$.

12. Let f be a continuous function in $\mathfrak{L}^1(\mathbb{R}^n)$. Show that if $\lim_{\|x\| \to \infty} f(x)$ exists,
then $\lim_{\|x\| \to \infty} f(x) = 0$. Also give an example to show that a continuous
positive integrable function need not be bounded.

13. Let $f \in \mathfrak{L}^1(\mathbb{R}^n)$, and let $a \in \mathbb{R}^n$ be fixed. Define $(\tau_a f)(x) = f(x - a)$. Show
that $\int_{\mathbb{R}^n} f\,d\lambda = \int_{\mathbb{R}^n} (\tau_a f)\,d\lambda$. This is a familiar linear change of variables
formula. Using more conventional notation, $\int_{\mathbb{R}^n} f(x)\,d\lambda(x) = \int_{\mathbb{R}^n} f(x - a)\,d\lambda(x)$.

14. For a subset E of $\mathbb{R}^n$, let $-E = \{-x : x \in E\}$. Prove that E is measurable if
and only if $-E$ is measurable and, in this case, $\lambda(-E) = \lambda(E)$.

15. For $r > 0$ and $E \subseteq \mathbb{R}^n$, define $rE = \{rx : x \in E\}$. Prove that E is measurable
if and only if $rE$ is measurable and that $\lambda(rE) = r^n\lambda(E)$.

16. Let $f \in \mathfrak{L}^1(\mathbb{R}^n)$. For $r > 0$, define $f_r(y) = f(ry)$. Show that
$\int_{\mathbb{R}^n} f\,d\lambda = r^n \int_{\mathbb{R}^n} f_r\,d\lambda$. Using more familiar notation, if $x = ry$, then $d\lambda(x) = r^n d\lambda(y)$.

17. Let X be an infinite-dimensional normed linear space. Prove that there
does not exist a translation-invariant measure on $\mathcal{B}(X)$ that assigns finite
measure to bounded sets in $\mathcal{B}(X)$. Hint: Use Riesz's theorem to find a
sequence $\{u_n\}$ of unit vectors such that $\|u_i - u_j\| \ge 1/2$.


8.5 Complex Measures 


Complex measures do not really measure anything in the strict geometric sense of 
the word, but they do share the defining property of a positive measure, namely,
countable additivity. Although they are rather abstract, real and complex measures
have applications in differentiation and probability theories, among many other
applications. We study the notion of differentiating one measure with respect to
another measure, and our main result is the Radon-Nikodym theorem, which we
apply in section 8.6 to study duals of $\mathfrak{L}^p$ spaces. Although the section results are
limited to the basics, example 2 and the section exercises significantly expand 
the scope of the section, where we introduce such topics as the total variation 
of real and complex measures, uniform integrability, and measurable dissections. 
The properties of the Radon-Nikodym derivatives are also explored in the section 
exercises. 


Definition. Let $(X, \mathfrak{M})$ be a measurable space. A real measure on $\mathfrak{M}$ is a
countably additive function $\nu : \mathfrak{M} \to \mathbb{R}$. Thus if $(E_n)$ is a disjoint sequence in $\mathfrak{M}$, and
$E = \bigcup_n E_n$, then $\nu(E) = \sum_{n=1}^{\infty} \nu(E_n)$. By definition, $\nu(\emptyset) = 0$. Observe that
finite positive measures are real measures.


In this definition, it is tacitly assumed that the series is absolutely convergent. We
call the reader's attention to the fact that, according to the above definition, $\nu$ takes
finite values, that is, $\nu(E) = \infty$ and $\nu(E) = -\infty$ are specifically not permitted.⁹


Theorem 8.5.1. Let $(X, \mathfrak{M}, \nu)$ be a real measure space.


(a) If $(E_n)$ is an ascending sequence in $\mathfrak{M}$, then $\nu(\bigcup_{n=1}^{\infty} E_n) = \lim_n \nu(E_n)$.
(b) If $(E_n)$ is a descending sequence in $\mathfrak{M}$, then $\nu(\bigcap_{n=1}^{\infty} E_n) = \lim_n \nu(E_n)$.


Proof. The proofs parallel those of theorem 8.2.3 and are therefore omitted. ∎


Definition. Let $(X, \mathfrak{M}, \nu)$ be a real measure space. A measurable set E is said to
be a $\nu$-positive set (or simply positive) if $\nu(F) \ge 0$ for every measurable subset
F of E. A measurable set E is said to be a $\nu$-negative set (or simply negative) if
$\nu(F) \le 0$ for every measurable subset F of E. A measurable set E is said to be
$\nu$-null if $\nu(F) = 0$ for every measurable subset F of E.


Clearly, if a measurable set E is both negative and positive, then E is $\nu$-null.


Warning. Monotonicity does not hold for real measures. It is possible for a set of
positive measure to contain a subset of negative measure, and conversely.
Monotonicity does hold, however, for positive and negative sets: if F is a measurable
subset of a positive set E, then $\nu(F) \le \nu(E)$.


⁹ This definition is not standard. Most books allow a real measure to take extended real values, $\infty$
or $-\infty$, but not both. The standard term used in this case is signed measure.

Proposition 8.5.2. A measurable subset of a positive set is positive, and the
countable union of positive sets is positive. The corresponding statements are true for
negative sets.


Proof. The first assertion follows from the definition. To prove the second assertion,
let $(E_n)$ be a sequence of positive measurable subsets of X. Define $A_1 = E_1$,
and, for $n \ge 2$, let $A_n = E_n - \bigcup_{i=1}^{n-1} E_i$. The sequence $A_n$ is disjoint, and each $A_n$
is a positive set. Now let $E \subseteq \bigcup_{n=1}^{\infty} E_n = \bigcup_{n=1}^{\infty} A_n$. Since $\nu(E \cap A_n) \ge 0$,
$\nu(E) = \sum_{n=1}^{\infty} \nu(E \cap A_n) \ge 0$.


Lemma 8.5.3. Every set of positive measure contains a positive set of positive 
measure. 


Proof. Suppose, for a contradiction, that S is a set of positive measure that contains
no positive sets of positive measure. We first establish the following:

If $A \subseteq S$ and $\nu(A) > 0$, then there is a subset B of A such that $\nu(B) > \nu(A)$. (*)

Since A is not a positive set, A contains a subset C such that $\nu(C) < 0$. Set
$B = A - C$. Then $\nu(B) = \nu(A) - \nu(C) > \nu(A)$. This proves (*).

Set $A_1 = S$, and let $n_1$ be the least natural number for which $\nu(A_1) > 1/n_1$. By
(*), $A_1$ contains a set B such that $\nu(B) > \nu(A_1)$. Let $n_2$ be the least natural
number for which $A_1$ contains a set B such that $\nu(B) > \nu(A_1) + 1/n_2$, and let $A_2$
be such a set. Continue inductively to construct a sequence of natural numbers
$n_1, n_2, \dots$, and measurable sets $A_1 \supseteq A_2 \supseteq \dots$ such that $n_j$ is the least positive
integer for which $A_{j-1}$ contains a set B with $\nu(B) > \nu(A_{j-1}) + 1/n_j$, and $A_j$ is such
a set. Now $\nu(A_3) > \nu(A_2) + 1/n_3 > \nu(A_1) + 1/n_2 + 1/n_3 > 1/n_1 + 1/n_2 + 1/n_3$. Inductively,
$\nu(A_j) > \sum_{i=1}^{j} 1/n_i$. Define $A = \bigcap_{j=1}^{\infty} A_j$. Then $\infty > \nu(A) = \lim_j \nu(A_j) \ge \sum_{j=1}^{\infty} 1/n_j$. In
particular, $\sum_{j=1}^{\infty} 1/n_j$ is convergent, and $\lim_j n_j = \infty$. Again by (*), A contains a
subset B such that $\nu(B) > \nu(A)$, and there is a natural number n such that
$\nu(B) > \nu(A) + 1/n$. But there is an integer j such that $n_j > n$. Thus
$\nu(B) > \nu(A) + 1/n > \nu(A_{j-1}) + 1/n$. This contradicts the definition of $n_j$ because $B \subseteq A_{j-1}$.
n 


Theorem 8.5.4 (the Hahn decomposition theorem). If $(X, \mathfrak{M}, \nu)$ is a real measure
space, then there exist a positive set P and a negative set N such that $X = P \cup N$
and $P \cap N = \emptyset$. The sets P and N are essentially unique in the sense that if $(Q, M)$
is another pair of subsets of X satisfying the conclusion of the theorem, then $P \Delta Q$
and $N \Delta M$ are $\nu$-null sets. Here $P \Delta Q = (P - Q) \cup (Q - P)$.

Proof. Let $K = \sup\{\nu(E) : E \in \mathfrak{M},\ E \text{ positive}\}$, let $P_n$ be a sequence of positive
measurable subsets such that $\lim_n \nu(P_n) = K$, and let $P = \bigcup_{n=1}^{\infty} P_n$. By proposition
8.5.2, P is positive; hence $\nu(P) \le K$. Now $\nu(P_n) \le \nu(P) \le K$ implies that
$K = \lim_n \nu(P_n) \le \nu(P) \le K$. Therefore $\nu(P) = K$. Notice that this proves that $K < \infty$.
Let $N = X - P$. We show that N is a negative set, and this will prove the existence
part of the theorem. If N is not negative, then it contains a subset S of positive
measure. By lemma 8.5.3, S contains a positive subset G of positive measure. This
contradicts the definition of K, since $P \cup G$ would be a positive set and
$\nu(P \cup G) = \nu(P) + \nu(G) > \nu(P) = K$. If the pair $(Q, M)$ also satisfies the conclusion of the
theorem, then $P - Q = P \cap M$ is both positive and negative and hence $P - Q$ is
a $\nu$-null set. Similarly, $Q - P$ is $\nu$-null and so is $P \Delta Q$. One shows that $N \Delta M$ is
$\nu$-null using an argument identical to this one.


Corollary 8.5.5. A real measure is bounded.


Proof. We use the notation of the proof of the previous theorem. Let $k = \nu(N)$. For a
measurable set E, $\nu(E) = \nu(E \cap P) + \nu(E \cap N)$. Since $0 \le \nu(E \cap P) \le K$ and
$k \le \nu(E \cap N) \le 0$, we have $k \le \nu(E) \le K$.
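
On a finite set, where every subset is measurable and a real measure is determined by its point weights, the Hahn decomposition and the bounds of corollary 8.5.5 can be checked by brute force. The following is only an illustrative sketch: the weights and the helper names (`nu`, `hahn_decomposition`, `all_subsets`) are assumptions made for this example and do not come from the text.

```python
from itertools import combinations

# Hypothetical point weights w(x) = nu({x}) on the finite set X = {0, ..., 5}.
weights = {0: 1.5, 1: -2.0, 2: 0.0, 3: 0.75, 4: -0.25, 5: 3.0}


def nu(subset):
    """Real measure of a subset: the sum of its point weights."""
    return sum(weights[x] for x in subset)


def hahn_decomposition():
    """Return (P, N): P collects the points of weight >= 0, N the rest."""
    P = frozenset(x for x, w in weights.items() if w >= 0)
    N = frozenset(weights) - P
    return P, N


def all_subsets(points):
    """Enumerate every subset of a finite collection of points."""
    pts = list(points)
    for r in range(len(pts) + 1):
        for combo in combinations(pts, r):
            yield frozenset(combo)


P, N = hahn_decomposition()
K, k = nu(P), nu(N)  # K = nu(P) and k = nu(N), as in corollary 8.5.5

# P is positive and N is negative: every subset has the right sign.
p_is_positive = all(nu(F) >= 0 for F in all_subsets(P))
n_is_negative = all(nu(F) <= 0 for F in all_subsets(N))

# Boundedness: k <= nu(E) <= K for every subset E of X.
bounded = all(k <= nu(E) <= K for E in all_subsets(weights))

print(P, N, K, k, p_is_positive, n_is_negative, bounded)
```

The point of weight 0 could be placed in either P or N, which illustrates why the decomposition is unique only up to null sets.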


Definition. Two positive measures $\nu$ and $\mu$ on a $\sigma$-algebra $\mathfrak{M}$ are called mutually
singular if there exist disjoint measurable subsets Q and M such that $Q \cup M = X$
and $\mu(Q) = 0 = \nu(M)$.


Theorem 8.5.6 (the Jordan decomposition theorem). If $\nu$ is a real measure, then
there exist unique, finite, positive, mutually singular measures $\nu^+$ and $\nu^-$ such
that, for every $E \in \mathfrak{M}$, $\nu(E) = \nu^+(E) - \nu^-(E)$.


Proof. Let $(P, N)$ be a Hahn decomposition of $\nu$, and define $\nu^+(E) = \nu(E \cap P)$ and
$\nu^-(E) = -\nu(E \cap N)$. The pair $\nu^+$ and $\nu^-$ has the desired properties since
$\nu^+(N) = 0 = \nu^-(P)$. If $\mu^+$ and $\mu^-$ is another pair satisfying the stated properties with
$\mu^+(M) = 0 = \mu^-(Q)$, where $Q \cap M = \emptyset$, $Q \cup M = X$, then the pair $(Q, M)$ is a
Hahn decomposition of $\nu$ and hence $P \Delta Q$ is $\nu$-null. Therefore, for $E \in \mathfrak{M}$,

$$\mu^+(E) = \mu^+(E \cap Q) + \mu^+(E \cap M) = \mu^+(E \cap Q)
= \mu^+(E \cap Q) - \mu^-(E \cap Q) = \nu(E \cap Q) = \nu(E \cap P) = \nu^+(E).$$

Thus $\mu^+ = \nu^+$; hence $\mu^- = \nu^-$.


Definitions. The finite positive measures $\nu^+$ and $\nu^-$ are called the positive and
negative variations of $\nu$, respectively. The finite positive measure $|\nu| = \nu^+ + \nu^-$
is called the total variation of $\nu$. Notice that, for every $E \in \mathfrak{M}$, $|\nu(E)| \le |\nu|(E)$.
Define $\|\nu\| = |\nu|(X)$.

Example 1. Let $\lambda$ be Lebesgue measure on $\mathbb{R}$, and let $\mathcal{L}$ be the set of Lebesgue
measurable subsets of $\mathbb{R}$. Let $f(x) = xe^{-|x|}$, and define a set function $\nu : \mathcal{L} \to \mathbb{R}$
by $\nu(E) = \int_E f\,d\lambda$ $(E \in \mathcal{L})$. One can easily check that $\nu$ is a real measure
(also see theorem 8.5.7). The Hahn decomposition of $\nu$ consists of the sets
$P = [0, \infty)$ and $N = (-\infty, 0)$. Let $A = (-1, 2)$, $B = (-1, 0)$, and $C = (0, 2)$.
Notice that while $B \subseteq A$, A has positive measure and B has negative
measure. Also $C \subseteq A$, but $\nu(C) = \int_0^2 xe^{-x}\,dx > \nu(A) = \int_{-1}^{0} xe^{x}\,dx + \int_0^2 xe^{-x}\,dx$. One
can see that $\nu^+(E) = \int_{E \cap (0,\infty)} f\,d\lambda$, $\nu^-(E) = -\int_{E \cap (-\infty,0)} f\,d\lambda$, and that
$|\nu|(E) = \int_E |f|\,d\lambda$. In particular, $\nu(\mathbb{R}) = 0$, while $\|\nu\| = |\nu|(\mathbb{R}) = 2$. ◆
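
The numbers in example 1 can be checked by numerical quadrature. The sketch below assumes the density is $f(x) = xe^{-|x|}$, which is the reading consistent with $\|\nu\| = 2$; the midpoint-rule helper `integrate` and the truncation of $\mathbb{R}$ to $[-60, 60]$ are choices made only for this illustration.

```python
import math


def f(x):
    """Density of example 1, read here as f(x) = x * exp(-|x|)."""
    return x * math.exp(-abs(x))


def integrate(g, a, b, steps=100_000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def nu(a, b):
    """nu((a, b)) = integral of f over (a, b)."""
    return integrate(f, a, b)


def total_variation(a, b):
    """|nu|((a, b)) = integral of |f| over (a, b)."""
    return integrate(lambda x: abs(f(x)), a, b)


nu_A = nu(-1.0, 2.0)  # A = (-1, 2)
nu_B = nu(-1.0, 0.0)  # B = (-1, 0), a subset of A
nu_C = nu(0.0, 2.0)   # C = (0, 2), a subset of A

# The tails beyond |x| = 60 are far below the quadrature error.
nu_R = nu(-60.0, 60.0)
norm_nu = total_variation(-60.0, 60.0)

print(nu_A, nu_B, nu_C, nu_R, norm_nu)
```

The printed values show the failure of monotonicity directly: B and C are both subsets of A, yet the measure of B is negative and the measure of C exceeds that of A.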


Definition. Let $(X, \mathfrak{M})$ be a measurable space. A complex measure on $\mathfrak{M}$ is a
countably additive complex-valued function on $\mathfrak{M}$.


Now let $\nu$ be a complex measure, and let $\nu_r$ and $\nu_i$ be the real and imaginary parts
of $\nu$, that is, for $E \in \mathfrak{M}$, $\nu(E) = \nu_r(E) + i\nu_i(E)$. Clearly, $\nu_r$ and $\nu_i$ are real measures;
hence $\|\nu_r\| < \infty$, $\|\nu_i\| < \infty$, and $|\nu(E)| \le |\nu_r(E)| + |\nu_i(E)| \le \|\nu_r\| + \|\nu_i\| < \infty$.
Therefore, complex measures, like real measures, are bounded. Notice that the
set of complex measures contains the set of real measures and, in particular, the
set of finite positive measures. The set of complex measures on a $\sigma$-algebra $\mathfrak{M}$ is a
vector space under the obvious operations: for complex measures $\nu$ and $\mu$ and for
a complex scalar a, $(\nu + \mu)(E) = \nu(E) + \mu(E)$, and $(a\nu)(E) = a\nu(E)$ $(E \in \mathfrak{M})$.


The following theorem generalizes example 1 and provides a rich source of real 
and complex measures. 


Theorem 8.5.7. Let $(X, \mathfrak{M})$ be a measurable space, and let $\mu$ be a positive (not
necessarily finite) measure on $\mathfrak{M}$. If $h \in \mathfrak{L}^1(\mu)$, then the following set function
defines a complex measure on $\mathfrak{M}$:

$$\nu(E) = \int_E h\,d\mu.$$

Proof. Let $\{E_n\}$ be a disjoint family of members of $\mathfrak{M}$, and let $E = \bigcup_{n=1}^{\infty} E_n$; then
$h\chi_E = \sum_{n=1}^{\infty} h\chi_{E_n} = \lim_n \sum_{i=1}^{n} h\chi_{E_i}$. Since $|\sum_{i=1}^{n} h\chi_{E_i}| \le \sum_{i=1}^{n} |h|\chi_{E_i} \le |h| \in \mathfrak{L}^1(\mu)$,
the dominated convergence theorem implies that

$$\nu(E) = \int_X h\chi_E\,d\mu = \int_X \sum_{n=1}^{\infty} h\chi_{E_n}\,d\mu
= \lim_n \int_X \sum_{i=1}^{n} h\chi_{E_i}\,d\mu = \lim_n \sum_{i=1}^{n} \nu(E_i) = \sum_{n=1}^{\infty} \nu(E_n).$$

Definition. Let $\mu$ and $\nu$ be as in theorem 8.5.7. The function h is called the
Radon-Nikodym derivative of $\nu$ with respect to $\mu$, and we symbolically write $h = \frac{d\nu}{d\mu}$
or $d\nu = h\,d\mu$, to indicate that $\nu(E) = \int_E h\,d\mu$. The following theorem justifies the
definition and the notation.


Theorem 8.5.8. Let $\mu$ be a positive measure, let $h \in \mathfrak{L}^1(\mu)$ be a positive function,
and let $d\nu = h\,d\mu$. Then, for a positive measurable function f,

$$\int_X f\,d\nu = \int_X fh\,d\mu.$$

In particular, if $f \in \mathfrak{L}^1(\nu)$, then $fh \in \mathfrak{L}^1(\mu)$ and $\|f\|_{1,\nu} = \|fh\|_{1,\mu}$.


Proof. For a measurable set E, $\int_X \chi_E\,d\nu = \nu(E) = \int_E h\,d\mu = \int_X \chi_E h\,d\mu$. Linearity
guarantees that $\int_X s\,d\nu = \int_X sh\,d\mu$ for every simple function s. Now, for a
positive function f, let $s_n$ be an increasing sequence of simple functions such
that $\lim_n s_n = f$. Then $s_n h$ increases to $fh$, and, by the monotone convergence
theorem, $\int_X f\,d\nu = \lim_n \int_X s_n\,d\nu = \lim_n \int_X s_n h\,d\mu = \int_X fh\,d\mu$. The remaining parts
of the theorem are obvious.


Theorem 8.5.8 justifies the definition of the Radon-Nikodym derivative and the
notation $h = \frac{d\nu}{d\mu}$. Indeed, this theorem can be stated using the notation
$\int_X f\,d\nu = \int_X f\,\frac{d\nu}{d\mu}\,d\mu$. Observe that the last formula is reminiscent of the change of variables
formula. Problem 8 at the end of this section is what one might call the chain rule
for Radon-Nikodym derivatives.
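
On a finite set the identity $\int_X f\,d\nu = \int_X fh\,d\mu$ reduces to a finite sum and can be verified exactly. The following sketch uses made-up point masses and densities (all values and the helper `integral` are assumptions for illustration) and exact rational arithmetic.

```python
from fractions import Fraction as Fr

# Hypothetical positive measure mu on X = {"a", "b", "c", "d"} (point masses).
mu = {"a": Fr(1, 2), "b": Fr(2), "c": Fr(0), "d": Fr(1, 4)}

# A positive density h, integrable with respect to mu.
h = {"a": Fr(3), "b": Fr(1, 2), "c": Fr(7), "d": Fr(0)}


def integral(func, measure, subset=None):
    """Integral of func over subset (default: all of X) against point masses."""
    points = measure.keys() if subset is None else subset
    return sum((func[x] * measure[x] for x in points), Fr(0))


# d(nu) = h d(mu): on a finite set this reads nu({x}) = h(x) * mu({x}).
nu = {x: h[x] * mu[x] for x in mu}

# A positive measurable function f (every function on a finite set is simple).
f = {"a": Fr(2), "b": Fr(5), "c": Fr(1), "d": Fr(4)}
fh = {x: f[x] * h[x] for x in mu}

lhs = integral(f, nu)   # integral of f with respect to nu
rhs = integral(fh, mu)  # integral of f*h with respect to mu

print(lhs, rhs)
```

Note that the point "c" has $\mu$-measure zero, so it has $\nu$-measure zero regardless of the value of h there; the density is determined only up to $\mu$-null sets.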


Definition. Let $\mu$ be a positive (not necessarily finite) measure on a $\sigma$-algebra $\mathfrak{M}$,
and let $\nu$ be an arbitrary complex measure on $\mathfrak{M}$. We say that $\nu$ is absolutely
continuous with respect to $\mu$ if, for every $E \in \mathfrak{M}$, $\mu(E) = 0$ implies $\nu(E) = 0$. In
this situation, we write $\nu \ll \mu$.


Notice that if $\mathfrak{M}$, $\mu$, h, and $\nu$ are as in theorem 8.5.7, then $\nu \ll \mu$. The
Radon-Nikodym theorem, in effect, is the converse of theorem 8.5.7.


If $\nu$ is a real measure and $\nu = \nu^+ - \nu^-$ is the Jordan decomposition of $\nu$, then $\nu \ll \mu$
if and only if $\nu^+ \ll \mu$ and $\nu^- \ll \mu$. Also if $\nu = \nu_r + i\nu_i$ is a complex measure,
then $\nu \ll \mu$ if and only if $\nu_r$ and $\nu_i$ are absolutely continuous with respect to $\mu$. We
leave the details to the reader to verify.

Lemma 8.5.9. Let $\nu$ and $\mu$ be finite positive measures on a $\sigma$-algebra $\mathfrak{M}$, and
suppose that $\nu \ll \mu$. Then there exists a positive number $\epsilon$ and a set $P \in \mathfrak{M}$ such
that

(a) $\mu(P) > 0$, and
(b) P is positive for the measure $\nu - \epsilon\mu$, that is, $\nu(E \cap P) - \epsilon\mu(E \cap P) \ge 0$ for
every $E \in \mathfrak{M}$.


Proof. Since $0 < \nu(X) < \infty$ and $0 < \mu(X) < \infty$, there exists a positive number $\epsilon$
such that $\nu(X) - \epsilon\mu(X) > 0$. Let $(P, N)$ be the Hahn decomposition of the real
measure $\nu - \epsilon\mu$. Then P is positive for $\nu - \epsilon\mu$. If $\mu(P) = 0$, then $\nu(P) = 0$; hence
$\nu(X) - \epsilon\mu(X) = \nu(N) - \epsilon\mu(N) \le 0$, since N is negative for $\nu - \epsilon\mu$. This
contradicts $\nu(X) - \epsilon\mu(X) > 0$ and proves that $\mu(P) > 0$. ∎


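Theorem 8.5.10 (the Radon-Nikodym theorem, the real version). If $\nu$ and $\mu$
are finite positive measures and $\nu \ll \mu$, then there exists a unique positive
function $f \in \mathfrak{L}^1(\mu)$ such that $d\nu = f\,d\mu$, that is,

$$\nu(E) = \int_E f\,d\mu \quad \text{for every } E \in \mathfrak{M}.$$

Proof. The uniqueness of f follows from example 4 in section 8.3.¹⁰ Let $\mathfrak{F}$ be the
following set of measurable functions:

$$\mathfrak{F} = \Big\{f \ge 0 : \int_E f\,d\mu \le \nu(E)\ \ \forall E \in \mathfrak{M}\Big\}.$$

Since $f = 0 \in \mathfrak{F}$, $\mathfrak{F} \ne \emptyset$. Every $f \in \mathfrak{F}$ is $\mu$-integrable since $\int_X f\,d\mu \le \nu(X) < \infty$.
It follows that $\alpha = \sup\{\int_X f\,d\mu : f \in \mathfrak{F}\}$ is finite. We first prove the fact that if
$f, g \in \mathfrak{F}$, then $h = \max\{f, g\} \in \mathfrak{F}$. Let $A = \{x \in X : f(x) \ge g(x)\}$ and
$B = \{x \in X : f(x) < g(x)\}$. For every $E \in \mathfrak{M}$,
$\int_E h\,d\mu = \int_{A \cap E} h\,d\mu + \int_{B \cap E} h\,d\mu = \int_{A \cap E} f\,d\mu + \int_{B \cap E} g\,d\mu \le \nu(A \cap E) + \nu(B \cap E) = \nu(E)$.
Hence $h \in \mathfrak{F}$. By the definition of
$\alpha$, there exists a sequence $g_1, g_2, \dots \in \mathfrak{F}$ such that $\lim_n \int_X g_n\,d\mu = \alpha$. Let $f_1 = g_1$,
$f_2 = \max\{g_1, g_2\}$, ..., $f_n = \max\{g_1, g_2, \dots, g_n\}$. By the above fact, $f_n \in \mathfrak{F}$,
$0 \le f_1 \le f_2 \le \dots$, and $\lim_n \int_X f_n\,d\mu = \alpha$. Set $f(x) = \lim_n f_n(x)$. By the monotone convergence
theorem, $\int_E f\,d\mu = \lim_n \int_E f_n\,d\mu \le \nu(E)$; hence $f \in \mathfrak{F}$. Also, $\int_X f\,d\mu = \alpha$. We claim
that $\nu(E) = \int_E f\,d\mu$ for every $E \in \mathfrak{M}$. To this end, it is enough to show that the
measure $\zeta(E) = \nu(E) - \int_E f\,d\mu$ is identically equal to zero. If not, then $\zeta$ would be
a positive measure, and $\zeta \ll \mu$. By lemma 8.5.9, there exists a positive number
$\epsilon$ and a set P with $\mu(P) > 0$ such that P is positive for $\zeta - \epsilon\mu$. Thus, for every
$E \in \mathfrak{M}$, $\zeta(E) \ge \zeta(E \cap P) \ge \epsilon\mu(E \cap P) = \epsilon \int_E \chi_P\,d\mu$, and
$\nu(E) = \int_E f\,d\mu + \zeta(E) \ge \int_E (f + \epsilon\chi_P)\,d\mu$. Thus $f + \epsilon\chi_P \in \mathfrak{F}$. But this leads to the following violation of the
definition of $\alpha$: $\int_X (f + \epsilon\chi_P)\,d\mu = \int_X f\,d\mu + \epsilon\mu(P) > \int_X f\,d\mu = \alpha$.


¹⁰ More precisely, if $f_1\,d\mu = f_2\,d\mu$ for $f_i \in \mathfrak{L}^1(\mu)$, then $f_1 = f_2$, $\mu$-a.e.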


Theorem 8.5.11 (the Radon-Nikodym theorem). If $\mu$ is a finite positive measure
on $\mathfrak{M}$ and $\nu$ is a complex measure on $\mathfrak{M}$ such that $\nu \ll \mu$, then there exists
a unique complex-valued function $f \in \mathfrak{L}^1(\mu)$ such that, for every $E \in \mathfrak{M}$,
$\nu(E) = \int_E f\,d\mu$.


Proof. If $\nu$ is real and $\nu \ll \mu$, then $\nu^+ \ll \mu$ and $\nu^- \ll \mu$. By theorem 8.5.10, we
find positive $\mu$-integrable functions $f^+$ and $f^-$ such that, for every $E \in \mathfrak{M}$,
$\nu^+(E) = \int_E f^+\,d\mu$ and $\nu^-(E) = \int_E f^-\,d\mu$. Thus $\nu(E) = \int_E f\,d\mu$, where $f = f^+ - f^- \in \mathfrak{L}^1(\mu)$.
If $\nu$ is a complex measure, apply the result we just established to the real and
imaginary parts of $\nu$, since each part is absolutely continuous with respect to $\mu$.


As an application of the Radon-Nikodym theorem, we develop the definition of 
the total variation of a complex measure. 


Example 2. Let $\nu$ be a complex measure, and let $\mu$ be a finite positive measure
such that $\nu \ll \mu$. Such $\mu$ exists; one can take, for example, $\mu = |\nu_r| + |\nu_i|$.
By the Radon-Nikodym theorem, there exists a function $f \in \mathfrak{L}^1(\mu)$ such that
$d\nu = f\,d\mu$. We define the total variation of $\nu$ to be the finite positive measure
given by $d|\nu| = |f|\,d\mu$. Notice that this definition is consistent with the result
of problem 6(c) for real measures. We need to prove that $|\nu|$ is well defined in
the sense that it does not depend on the particular choice of $\mu$. Suppose that,
for finite positive measures $\mu_1$ and $\mu_2$, and for functions $f_i \in \mathfrak{L}^1(\mu_i)$,
$f_1\,d\mu_1 = f_2\,d\mu_2$. Let $\xi = \mu_1 + \mu_2$. Then $\xi$ is a finite positive measure and $\mu_i \ll \xi$. By
problem 8 in the section exercises, $f_1 \frac{d\mu_1}{d\xi}\,d\xi = f_2 \frac{d\mu_2}{d\xi}\,d\xi$. By the uniqueness of
the Radon-Nikodym derivative, $f_1 \frac{d\mu_1}{d\xi} = f_2 \frac{d\mu_2}{d\xi}$, $\xi$-a.e. Since $\frac{d\mu_i}{d\xi} \ge 0$,
$|f_1| \frac{d\mu_1}{d\xi} = \big|f_1 \frac{d\mu_1}{d\xi}\big| = \big|f_2 \frac{d\mu_2}{d\xi}\big| = |f_2| \frac{d\mu_2}{d\xi}$, $\xi$-a.e. Now, for a measurable set E,

$$\int_E |f_1|\,d\mu_1 = \int_E |f_1| \frac{d\mu_1}{d\xi}\,d\xi = \int_E |f_2| \frac{d\mu_2}{d\xi}\,d\xi = \int_E |f_2|\,d\mu_2.$$
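
The independence of $|\nu|$ from the dominating measure is easy to observe on a finite set, where Radon-Nikodym derivatives are ratios of point masses. The sketch below is illustrative only: the complex point masses, the second dominating measure `mu2`, and the helper names are assumptions made for this example.

```python
# Hypothetical complex measure nu on X = {0, 1, 2, 3}, given by point masses.
nu = {0: 3 + 4j, 1: -1j, 2: 0j, 3: -2.0 + 0j}

# First dominating measure: mu1 = |nu_r| + |nu_i|, as suggested in example 2.
mu1 = {x: abs(z.real) + abs(z.imag) for x, z in nu.items()}

# Second dominating measure: an arbitrary choice with positive point masses.
mu2 = {0: 0.5, 1: 2.0, 2: 1.0, 3: 4.0}


def density(nu, mu):
    """Radon-Nikodym derivative d(nu)/d(mu) on a finite set.

    Where mu({x}) = 0, absolute continuity forces nu({x}) = 0 and the
    density is arbitrary; it is set to 0 here.
    """
    return {x: (nu[x] / mu[x] if mu[x] > 0 else 0j) for x in nu}


def total_variation(nu, mu):
    """Point masses of |nu|, defined by d|nu| = |f| d(mu), f = d(nu)/d(mu)."""
    f = density(nu, mu)
    return {x: abs(f[x]) * mu[x] for x in nu}


tv1 = total_variation(nu, mu1)
tv2 = total_variation(nu, mu2)
norm1 = sum(tv1.values())
norm2 = sum(tv2.values())

print(tv1, tv2, norm1, norm2)
```

Both choices give the point masses $|\nu(\{x\})|$, so the two computed norms agree up to rounding.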


Exercises 


In the following exercises, $(X, \mathfrak{M})$ is a measurable space and $\mu$ is a positive measure
on $\mathfrak{M}$.

1. Prove that if P and Q are positive sets for a real measure $\nu$ such that $P \Delta Q$
is $\nu$-null, then $\nu(E \cap P) = \nu(E \cap Q) = \nu(E \cap P \cap Q)$ for every measurable
set E.

2. Show that, for a real measure $\nu$, $\nu^+ = \frac{1}{2}(|\nu| + \nu)$ and $\nu^- = \frac{1}{2}(|\nu| - \nu)$.

3. Let $\nu$ be a real measure. Prove that if $\xi$ and $\eta$ are finite positive measures
such that $\nu = \xi - \eta$, then $\xi \ge \nu^+$ and $\eta \ge \nu^-$.

4. Define the following function on the space of real measures on $\mathfrak{M}$:
$\|\nu\| = |\nu|(X)$. Prove that $\|\cdot\|$ is a norm.

5. Show that if $\nu$ is a real measure on $\mathfrak{M}$, then $\nu \ll \mu$ if and only if $\nu^+ \ll \mu$
and $\nu^- \ll \mu$, if and only if $|\nu| \ll \mu$.

6. Let $f \in \mathfrak{L}^1(\mu)$ be a real-valued function, and define $\nu(E) = \int_E f\,d\mu$ $(E \in \mathfrak{M})$.
Prove that
(a) the pair $(P, N)$ is a Hahn decomposition of $\nu$, where $P = \{x \in X : f(x) \ge 0\}$
and $N = \{x \in X : f(x) < 0\}$;
(b) $\nu^+(E) = \int_E f^+\,d\mu$ and $\nu^-(E) = \int_E f^-\,d\mu$; and
(c) $|\nu|(E) = \int_E |f|\,d\mu$; using our notation for Radon-Nikodym derivatives,
this result can be written as $\frac{d|\nu|}{d\mu} = \big|\frac{d\nu}{d\mu}\big|$.

Definition. A subset $\mathfrak{F}$ of $\mathfrak{L}^1(\mu)$ is said to be uniformly integrable if, for
every $\epsilon > 0$, there exists $\delta > 0$ such that $\int_E |f|\,d\mu < \epsilon$ for every $f \in \mathfrak{F}$ and for
every measurable set E with $\mu(E) < \delta$.

7. Prove that if $(f_n)$ is a convergent sequence in $\mathfrak{L}^1(\mu)$, then $\{f_n\}$ is uniformly
integrable. Hint: See example 7 in section 8.3.

8. Let $\zeta$, $\nu$, and $\mu$ be finite positive measures such that $\zeta \ll \nu \ll \mu$. Show that
$\frac{d\zeta}{d\mu} = \frac{d\zeta}{d\nu}\frac{d\nu}{d\mu}$. Conclude that if $\nu \ll \mu \ll \nu$, then $\frac{d\nu}{d\mu} = \big(\frac{d\mu}{d\nu}\big)^{-1}$ ($\mu$- or $\nu$-a.e.).

9. Prove that if $\nu$ is a complex measure, then, for every measurable set E,
$|\nu(E)| \le |\nu|(E)$.

10. For a complex measure $\nu$, define $\|\nu\| = |\nu|(X)$. Prove that $\|\cdot\|$ is a norm on
the space of complex measures on $\mathfrak{M}$.

11. Let $\nu$ be a complex measure on $\mathfrak{M}$. Show that $\nu \ll \mu$ if and only if, for
every $\epsilon > 0$, there exists $\delta > 0$ such that, for every $E \in \mathfrak{M}$, $\mu(E) < \delta$ implies
that $|\nu(E)| < \epsilon$. Hint: See example 7 in section 8.3.

Definition. Let $\nu$ be a real measure on $\mathfrak{M}$. A measurable dissection of E is
a disjoint collection $\{E_1, \dots, E_n\}$ of measurable subsets such that $E = \bigcup_{i=1}^{n} E_i$.

12. Prove that, for every $E \in \mathfrak{M}$, $|\nu|(E) = \sup \sum_{i=1}^{n} |\nu(E_i)|$, where the supremum
is taken over all measurable dissections of E. The result also holds
when $\{E_i\}$ is a countable dissection of E. Hint: If $(P, N)$ is a Hahn
decomposition of $\nu$, then $E_1 = E \cap P$ and $E_2 = E \cap N$ is a measurable dissection
of E.

Definition. A positive measure $\mu$ on $(X, \mathfrak{M})$ is said to be $\sigma$-finite if
$X = \bigcup_{n=1}^{\infty} X_n$, where $X_n \in \mathfrak{M}$ and $\mu(X_n) < \infty$. We may, and often do, choose
$(X_n)$ to be a disjoint sequence.

13. Prove that theorem 8.5.11 is valid when $\mu$ is a $\sigma$-finite positive measure and
$\nu$ is a complex measure such that $\nu \ll \mu$.

Here is a proof outline. It is sufficient to prove the result when $\nu$ is a finite
positive measure. For $n \in \mathbb{N}$, define two finite positive measures on $\mathfrak{M}$ as
follows: $\mu_n(E) = \mu(E \cap X_n)$ and $\nu_n(E) = \nu(E \cap X_n)$. Show that $\nu_n \ll \mu_n$. By
theorem 8.5.10, there exist positive functions $h_n \in \mathfrak{L}^1(\mu_n)$ such that
$d\nu_n = h_n\,d\mu_n$. Without loss of generality, $h_n$ vanishes outside $X_n$. Set $h = \sum_{n=1}^{\infty} h_n$.
Argue that $h \in \mathfrak{L}^1(\mu)$ and that $d\nu = h\,d\mu$.


8.6 $\mathfrak{L}^p$ Spaces


In addition to the function spaces B(X), C(X), and BC(X), the $\mathfrak{L}^p$ spaces are
prototypical examples of Banach spaces and play a prominent role in modern
analysis. By far, the most important of the $\mathfrak{L}^p$ spaces is the Hilbert space
$\mathfrak{L}^2(X, \mathfrak{M}, \mu)$, where $(X, \mathfrak{M}, \mu)$ is a positive measure space, such as a Lebesgue
measure restricted to a subset X of $\mathbb{R}^n$. The section results parallel those for the
sequence spaces $l^p$. We prove the completeness of $\mathfrak{L}^p$ and derive the representation
theorem that, for $1 < p < \infty$, $\mathfrak{L}^q$ is the dual of $\mathfrak{L}^p$. In fact, the sequence spaces
are special cases of the $\mathfrak{L}^p$ spaces. See problem 1 at the end of this section. The
next section is a continuation of this one.


Throughout this section, $p > 1$ and $q > 1$ denote conjugate Hölder exponents; thus
$\frac{1}{p} + \frac{1}{q} = 1$. It is understood that $p = 1$ and $q = \infty$ are conjugate exponents.


Lemma 8.6.1. For all $x, y \in \mathbb{C}$,

(a) $|xy| \le \frac{|x|^p}{p} + \frac{|y|^q}{q}$, $1 < p, q < \infty$.
(b) $|x + y|^p \le 2^p(|x|^p + |y|^p)$, $1 \le p < \infty$.


Proof. Part (a) was established in lemma 3.6.1.
For part (b), $|x + y| \le |x| + |y| \le 2\max\{|x|, |y|\}$. Thus
$|x + y|^p \le 2^p(\max\{|x|, |y|\})^p = 2^p \max\{|x|^p, |y|^p\} \le 2^p(|x|^p + |y|^p)$.

Definition. Let $(X, \mathfrak{M})$ be a measurable space, and let $\mu$ be a positive measure on
$\mathfrak{M}$. For $1 \le p < \infty$, we define $\mathfrak{L}^p(\mu)$ to be the set of all measurable functions
$f : X \to \mathbb{C}$ such that $\int_X |f|^p\,d\mu < \infty$. If the measure $\mu$ is understood, we sometimes
write $\mathfrak{L}^p$ for $\mathfrak{L}^p(\mu)$. In anticipation of the fact that $\mathfrak{L}^p$ is a normed linear space,
we write

$$\|f\|_p = \Big(\int_X |f|^p\,d\mu\Big)^{1/p} \quad \text{for } f \in \mathfrak{L}^p(\mu).$$

Definition. We define $\mathfrak{L}^\infty(\mu)$ as follows. A measurable function f is said to be
essentially bounded if there exists a positive constant M such that $|f(x)| \le M$
for almost every $x \in X$. Thus f is bounded by M a.e. Such a constant M is called
an essential upper bound of f. The space $\mathfrak{L}^\infty(\mu)$ is the set of all essentially
bounded functions on X. For $f \in \mathfrak{L}^\infty(\mu)$, we define

$$\|f\|_\infty = \inf\{M \ge 0 : M \text{ is an essential upper bound of } f\}.$$

We leave it to the reader to prove that $\|f\|_\infty$ is an essential upper bound of f.
Thus $\|f\|_\infty$ is the least nonnegative constant such that $|f(x)| \le \|f\|_\infty$ a.e.

Observe that if $0 < \epsilon < \|f\|_\infty$, then the set $\{x \in X : |f(x)| > \|f\|_\infty - \epsilon\}$ has
positive measure.


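Theorem 8.6.2 (Hölder's inequality). If $f \in \mathfrak{L}^p(\mu)$ and $g \in \mathfrak{L}^q(\mu)$, then $fg \in \mathfrak{L}^1(\mu)$,
and $\|fg\|_1 \le \|f\|_p \|g\|_q$.


Proof. By lemma 8.6.1,

$$\frac{|f(x)|\,|g(x)|}{\|f\|_p \|g\|_q} \le \frac{1}{p}\frac{|f(x)|^p}{\|f\|_p^p} + \frac{1}{q}\frac{|g(x)|^q}{\|g\|_q^q}.$$

Integrating both sides,

$$\frac{1}{\|f\|_p \|g\|_q}\int_X |fg|\,d\mu \le \frac{1}{p} + \frac{1}{q} = 1;$$

therefore $\int_X |fg|\,d\mu \le \|f\|_p \|g\|_q$.

If $f \in \mathfrak{L}^1$ and $g \in \mathfrak{L}^\infty$, then $\int_X |fg|\,d\mu \le \int_X |f|\,\|g\|_\infty\,d\mu = \|f\|_1 \|g\|_\infty$. ∎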


When p = 2 = q, Holder’s inequality is the familiar Cauchy-Schwarz inequality. 
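
As an informal numerical illustration (not part of the formal development), Hölder's inequality can be checked on a finite measure space, where the integral reduces to a weighted sum. The sketch below assumes a measure on four points given by positive weights; the names `lp_norm` and `holder_gap` and the sample data are ours and purely illustrative.

```python
def lp_norm(values, weights, p):
    """(sum |f_i|^p * w_i)^(1/p): the p-norm on a finite measure space with point masses w_i."""
    return sum(abs(v) ** p * w for v, w in zip(values, weights)) ** (1.0 / p)


def holder_gap(f, g, weights, p):
    """Return ||f||_p * ||g||_q - ||fg||_1; Hoelder's inequality says this is >= 0."""
    q = p / (p - 1.0)  # conjugate exponent, valid for 1 < p < infinity
    fg = [a * b for a, b in zip(f, g)]
    return lp_norm(f, weights, p) * lp_norm(g, weights, q) - lp_norm(fg, weights, 1.0)


# Hypothetical sample data: a measure on four points and two functions on it.
weights = [0.5, 1.0, 2.0, 0.25]
f = [1.0, -2.0, 0.5, 3.0]
g = [0.3, 1.5, -1.0, 2.0]
for p in (1.5, 2.0, 4.0):
    print(p, holder_gap(f, g, weights, p))
```

The gap vanishes (up to rounding) when $|g| = |f|^{p-1}$, which is the choice of test function used later in the proof of theorem 8.6.5.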


Example 1. If $f, g \in \mathfrak{L}^1(\mu)$, then $\sqrt{|fg|} \in \mathfrak{L}^1$.
By assumption, the functions $\sqrt{|f|}$ and $\sqrt{|g|}$ are in $\mathfrak{L}^2$. By the Cauchy-Schwarz
inequality, $\sqrt{|fg|} \in \mathfrak{L}^1$. ∎


Example 2. We show that $\int_1^\infty \frac{e^{-x}}{x}\,dx \le \frac{1}{e\sqrt{2}}$.
Let $f(x) = 1/x$, and $g(x) = e^{-x}$. Then $f$ and $g$ are in $\mathfrak{L}^2((1,\infty))$, and $\|f\|_2 = 1$,
$\|g\|_2 = \frac{1}{e\sqrt{2}}$. The desired inequality now follows from the Cauchy-Schwarz
inequality. ∎


Example 3. If $\mu(X) < \infty$, then, for $p \in (1, \infty)$, $\mathfrak{L}^p(\mu) \subseteq \mathfrak{L}^1(\mu)$, and, for $f \in \mathfrak{L}^p(\mu)$,
$\|f\|_1 \le \|f\|_p (\mu(X))^{1/q}$. In particular, if $\mu(X) = 1$, then $\|f\|_1 \le \|f\|_p$.


Because $\mu(X) < \infty$, the constant function $g(x) = 1$ is in $\mathfrak{L}^q$. Using Hölder's
inequality, we have $\int_X |f|\,d\mu \le \|f\|_p \|g\|_q = \|f\|_p (\mu(X))^{1/q}$. ∎
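
The bound in example 2 can be sanity-checked numerically. The following sketch approximates $\int_1^\infty e^{-x}/x\,dx$ with a midpoint rule; truncating the integral at $x = 40$ is an assumption of ours (the discarded tail is smaller than $e^{-40}$), and the helper name `midpoint_integral` is illustrative.

```python
import math


def midpoint_integral(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


# Left side of example 2, truncated at x = 40 (the tail is below e^(-40)).
lhs = midpoint_integral(lambda x: math.exp(-x) / x, 1.0, 40.0, 200_000)
# Right side: ||f||_2 * ||g||_2 = 1 * 1/(e * sqrt(2)).
rhs = 1.0 / (math.e * math.sqrt(2.0))
print(lhs, rhs, lhs <= rhs)
```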


Example 4. Suppose that $\omega$ is a positive integrable function on $\mathbb{R}^n$ and that
$\int_{\mathbb{R}^n} \omega(x)\,dx = 1$. If $f$ is a measurable function such that $|f|^p\omega$ is Lebesgue integrable,
so is $|f|\omega$, and

$$\int_{\mathbb{R}^n} |f(x)|\,\omega(x)\,dx \le \Big(\int_{\mathbb{R}^n} |f(x)|^p\,\omega(x)\,dx\Big)^{1/p}.$$

Here $p \in [1, \infty)$.


Define a finite measure on $\mathbb{R}^n$ by $d\mu = \omega\,d\lambda$, where $\lambda$ is Lebesgue measure.
The measure $\mu$ and the function $f$ satisfy the conditions of example 3; hence the
result. ∎


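Theorem 8.6.3 (Minkowski's inequality). For $f, g \in \mathfrak{L}^p$, $f + g \in \mathfrak{L}^p$ and $\|f + g\|_p \le
\|f\|_p + \|g\|_p$.


Proof. We leave the cases $p = 1$ and $p = \infty$ to the reader. Assume $1 < p < \infty$. By
lemma 8.6.1, $\int_X |f + g|^p\,d\mu \le \int_X 2^p(|f|^p + |g|^p)\,d\mu = 2^p(\|f\|_p^p + \|g\|_p^p) < \infty$. This
shows that $f + g \in \mathfrak{L}^p$. To prove Minkowski's inequality when $1 < p < \infty$, notice
that if $h \in \mathfrak{L}^p$, then $|h|^{p-1} \in \mathfrak{L}^q$ because $(p - 1)q = p$. Now

$$\begin{aligned}
\|f + g\|_p^p &= \int_X |f + g|^p\,d\mu = \int_X |f + g|\,|f + g|^{p-1}\,d\mu \le \int_X |f|\,|f + g|^{p-1}\,d\mu + \int_X |g|\,|f + g|^{p-1}\,d\mu \\
&\le \Big(\int_X |f|^p\,d\mu\Big)^{1/p}\Big(\int_X |f + g|^{(p-1)q}\,d\mu\Big)^{1/q} + \Big(\int_X |g|^p\,d\mu\Big)^{1/p}\Big(\int_X |f + g|^{(p-1)q}\,d\mu\Big)^{1/q} \\
&= \|f + g\|_p^{p/q}\,(\|f\|_p + \|g\|_p);
\end{aligned}$$

hence $\|f + g\|_p \le \|f\|_p + \|g\|_p$. ∎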


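It is now easy to verify that $\mathfrak{L}^p$ is a normed linear space for $1 \le p \le \infty$.


Theorem 8.6.4 (completeness of $\mathfrak{L}^p(\mu)$). For an arbitrary positive measure $\mu$,
$\mathfrak{L}^p(\mu)$ is a Banach space for $1 \le p \le \infty$.


Proof. We do two cases.


Case 1. $1 \le p < \infty$. We use the result of problem 10 on section 6.1. Let $(f_n)$ be a
sequence in $\mathfrak{L}^p$ such that $K = \sum_{k=1}^\infty \|f_k\|_p < \infty$. We show that the series $\sum_{k=1}^\infty f_k$
converges in $\mathfrak{L}^p$. Define $g_n = \sum_{k=1}^n |f_k|$, and let $g = \sum_{k=1}^\infty |f_k|$. Then $g_n \in \mathfrak{L}^p$
and $\|g_n\|_p \le \sum_{k=1}^n \|f_k\|_p \le K$. By the monotone convergence theorem, $\int_X g^p\,d\mu =
\lim_n \int_X g_n^p\,d\mu \le K^p$. Thus $g \in \mathfrak{L}^p$. In particular, the series $\sum_{k=1}^\infty f_k(x)$ converges for
a.e. $x \in X$. Define $f(x) = \sum_{k=1}^\infty f_k(x)$. Since $|f| \le g$, $f \in \mathfrak{L}^p$. Finally, we show that
the sequence $\sum_{k=1}^n f_k$ converges to $f$ in $\mathfrak{L}^p$. Now $|f - \sum_{k=1}^n f_k|^p \le (2g)^p$, and the
dominated convergence theorem implies that $\lim_n \|f - \sum_{k=1}^n f_k\|_p^p = 0$.


Case 2. $p = \infty$. Let $(f_n)$ be a Cauchy sequence in $\mathfrak{L}^\infty$, and suppose that, for $m, n \ge N$,
$\|f_n - f_m\|_\infty < \epsilon$. Let $E_n = \{x \in X : |f_n(x)| > \|f_n\|_\infty\}$, and
$E_{m,n} = \{x \in X : |f_n(x) - f_m(x)| > \|f_n - f_m\|_\infty\}$. By definition of
$\|\cdot\|_\infty$, the set $E = \bigcup_{n=1}^\infty E_n \cup \bigcup_{m,n=1}^\infty E_{m,n}$ has measure 0.

For $m, n \ge N$, $\sup\{|f_n(x) - f_m(x)| : x \in X - E\} < \epsilon$. Thus $(f_n)$ is a Cauchy
sequence in the space $B(X - E)$. Therefore, by 4.8.1, $f_n$ converges uniformly to
some function $f \in B(X - E)$. Extend $f$ to $X$ by defining $f(x) = 0$ for $x \in E$. Clearly,
$\|f_n - f\|_\infty \to 0$ as $n \to \infty$. ∎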


Representation of Bounded Linear Functionals on $\mathfrak{L}^p$


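Definition. If $g$ is a measurable function, the sign of $g$ is the function

$$(\operatorname{sgn}(g))(x) = \begin{cases} \dfrac{\overline{g(x)}}{|g(x)|} & \text{if } g(x) \ne 0, \\ 1 & \text{otherwise.} \end{cases}$$

Notice that $g\,\operatorname{sgn}(g) = |g|$ and that $|\operatorname{sgn}(g)| = 1$.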


Theorem 8.6.5. Let $1 < p \le \infty$, and let $g \in \mathfrak{L}^q(\mu)$. Then the functional

$$\Phi_g(f) = \int_X fg\,d\mu$$

is a bounded linear functional on $\mathfrak{L}^p(\mu)$, and $\|\Phi_g\| = \|g\|_q$. The same is true for
$p = 1$ under the additional assumption that $\mu$ is $\sigma$-finite.


Proof. By Hölder's inequality, $|\Phi_g(f)| \le \int_X |fg|\,d\mu \le \|f\|_p \|g\|_q$. Since the linearity of
$\Phi_g$ is obvious, this inequality shows that $\Phi_g$ is bounded and that $\|\Phi_g\| \le \|g\|_q$.
It remains to show that $\|\Phi_g\| \ge \|g\|_q$.

If $1 < p < \infty$, let $f = |g|^{q-1}\operatorname{sgn}(g)$. Then $f \in \mathfrak{L}^p(\mu)$, and $\|f\|_p^p = \|g\|_q^q$. Now
$\Phi_g(f) = \int_X fg\,d\mu = \int_X |g|^q\,d\mu = \|g\|_q^q = \|g\|_q^{q-1}\|g\|_q = \|f\|_p \|g\|_q$. This concludes
the proof of the case $1 < p < \infty$.

If $p = \infty$, set $f = \operatorname{sgn}(g)$. Then $\|f\|_\infty = 1$, and $\int_X fg\,d\mu = \int_X |g|\,d\mu = \|g\|_1$.


Now suppose that $p = 1$ and that $\mu$ is $\sigma$-finite. Let $0 < \epsilon < \|g\|_\infty$, and let $E = \{x \in
X : |g(x)| > \|g\|_\infty - \epsilon\}$. By definition of $\|g\|_\infty$, $\mu(E) > 0$. Since $\mu$ is $\sigma$-finite, $X =
\bigcup_{n=1}^\infty X_n$, where each $X_n$ has finite measure. Since $E = \bigcup_{n=1}^\infty (E \cap X_n)$ and since $0 <
\mu(E) \le \sum_{n=1}^\infty \mu(E \cap X_n)$, $\mu(E \cap X_n) > 0$ for some integer $n$. Let $A = E \cap X_n$, and
let $f = \operatorname{sgn}(g)\chi_A/\mu(A)$. Then $f \in \mathfrak{L}^1(\mu)$, $\|f\|_1 = 1$, and $|\Phi_g(f)| = \frac{1}{\mu(A)}\int_A |g|\,d\mu >
\|g\|_\infty - \epsilon$. ∎


Theorem 8.6.5 establishes the fact that $\Phi : g \mapsto \Phi_g$ is an isometry from $\mathfrak{L}^q(\mu)$ into
$(\mathfrak{L}^p(\mu))^*$. Theorem 8.6.7 establishes sufficient conditions for $\Phi$ to be an isometric
isomorphism. First we need a technical result.


Lemma 8.6.6. Suppose $\mu(X) < \infty$. If $g \in \mathfrak{L}^1(\mu)$ and there exists a constant $M$ such
that $|\int_X sg\,d\mu| \le M\|s\|_p$ for every simple function $s$, then $g \in \mathfrak{L}^q$.


Proof. Because $\mu(X) < \infty$, all measurable simple functions are in $\mathfrak{L}^p(\mu)$, and
$\mathfrak{L}^\infty(\mu) \subseteq \mathfrak{L}^p(\mu)$ for all $p \ge 1$. We work out two separate cases.


Case 1. $1 < p < \infty$. First we show that $|\int_X fg\,d\mu| \le M\|f\|_p$ for every function
$f \in \mathfrak{L}^\infty$. To see that, let $(s_n)$ be a sequence of simple functions that converges
to $f$ in $\mathfrak{L}^\infty$ (see theorem 8.3.2). In this case, $s_n$ converges to $f$ in $\mathfrak{L}^p$ for every
$p \ge 1$. Now $|\int_X (fg - s_n g)\,d\mu| \le \|s_n - f\|_\infty \|g\|_1 \to 0$ as $n \to \infty$. Thus $|\int_X fg\,d\mu| =
\lim_n |\int_X s_n g\,d\mu| \le \lim_n M\|s_n\|_p = M\|f\|_p$.

We show that $g \in \mathfrak{L}^q$. Let $E_n = \{x \in X : |g(x)| \le n\}$, and let $f = |g|^{q-1}\operatorname{sgn}(g)\chi_{E_n}$.
Then $f \in \mathfrak{L}^\infty$, $fg = |g|^q\chi_{E_n}$, and $|f|^p = |g|^q\chi_{E_n}$. Hence $\int_{E_n} |g|^q\,d\mu = \int_X fg\,d\mu \le
M(\int_X |f|^p\,d\mu)^{1/p} = M(\int_{E_n} |g|^q\,d\mu)^{1/p}$. Thus $(\int_{E_n} |g|^q\,d\mu)^{1/q} \le M$. Taking the limit
of the left side of the last inequality, the monotone convergence theorem yields
$\|g\|_q \le M < \infty$.


Case 2. $p = 1$. For every measurable set $E$, $|\int_E g\,d\mu| \le M\|\chi_E\|_1 = M\mu(E)$. It follows
that, for every measurable set $E$ of positive measure, $\frac{1}{\mu(E)}\int_E g\,d\mu$ is in the closed
disk $D$ of radius $M$ and centered at the origin of the complex plane. We claim
that $|g(x)| \le M$ for a.e. $x \in X$, that is, $\|g\|_\infty \le M$. We show that if $B = B(z_0, r)$
is an open disk of radius $r$ in $\mathbb{C} - D$, then $\mu(g^{-1}(B)) = 0$. This will establish
the claim, since $\mathbb{C} - D$ is a countable union of open disks. Let $E = g^{-1}(B)$. If
$\mu(E) > 0$, then

$$\Big|\frac{1}{\mu(E)}\int_E g\,d\mu - z_0\Big| = \frac{1}{\mu(E)}\Big|\int_E (g - z_0)\,d\mu\Big| \le \frac{1}{\mu(E)}\int_E |g - z_0|\,d\mu < r.$$

Thus $\frac{1}{\mu(E)}\int_E g\,d\mu \in B \cap D = \emptyset$. This contradiction completes the proof. ∎


Theorem 8.6.7. If $\mu$ is $\sigma$-finite, then the function $\Phi$ in theorem 8.6.5 is onto for
$1 \le p < \infty$.


Proof. Let $\varphi \in (\mathfrak{L}^p(\mu))^*$. We need to prove the existence of a function $g \in \mathfrak{L}^q$ such
that $\varphi = \Phi_g$. Equivalently, for all $f \in \mathfrak{L}^p(\mu)$,

$$\varphi(f) = \int_X fg\,d\mu. \qquad (6)$$

We first prove the result in the special case when $\mu(X) < \infty$. For a measurable
set $E$, define $\nu(E) = \varphi(\chi_E)$. Since $\varphi$ is linear and since, for disjoint measurable
sets $E_1$ and $E_2$, $\chi_{E_1 \cup E_2} = \chi_{E_1} + \chi_{E_2}$, $\nu(E_1 \cup E_2) = \nu(E_1) + \nu(E_2)$. Thus $\nu$ is finitely
additive. We show that $\nu$ is countably additive, and this will establish the fact
that $\nu$ is a complex measure. Let $(E_n)$ be a disjoint sequence in $\mathfrak{M}$, and let
$E = \bigcup_{n=1}^\infty E_n$. Let $A_k = \bigcup_{n=1}^k E_n$. Then $\lim_k \mu(A_k) = \mu(E)$; hence $\|\chi_E - \chi_{A_k}\|_p^p =
\int_X |\chi_E - \chi_{A_k}|^p\,d\mu = \mu(E - A_k) \to 0$ as $k \to \infty$. Thus $\chi_{A_k}$ converges to $\chi_E$ in $\mathfrak{L}^p(\mu)$.
By the continuity of $\varphi$, $\lim_k \varphi(\chi_{A_k}) = \varphi(\chi_E)$, that is, $\sum_{n=1}^\infty \nu(E_n) = \nu(E)$.

If $\mu(E) = 0$, then $\chi_E = 0$ $\mu$-a.e.; thus $\nu(E) = 0$.

The summary of the proof so far is that $\nu$ is a complex measure and $\nu \ll \mu$. By
the Radon-Nikodym theorem, there exists a function $g \in \mathfrak{L}^1(\mu)$ such that $\varphi(\chi_E) =
\int_E g\,d\mu$ for every measurable set $E$. The linearity of the functionals on the two
sides of the last identity implies that $\varphi(s) = \int_X sg\,d\mu$ for every simple measurable
function $s$. Now, for a simple function $s$, $|\int_X sg\,d\mu| = |\varphi(s)| \le \|\varphi\|\,\|s\|_p$. By the
previous lemma, $g \in \mathfrak{L}^q$. The functional on the left-hand side of identity (6) is
continuous on $\mathfrak{L}^p$ by assumption, and the functional on the right side of (6) is
continuous by theorem 8.6.5. Since the two functionals agree on a dense subset
of $\mathfrak{L}^p$, namely, the set of simple functions (see theorem 8.7.3), identity (6) holds
for all $f \in \mathfrak{L}^p$. This completes the proof of the theorem when $\mu(X) < \infty$.


Now suppose that $\mu(X) = \infty$ and that $X$ is the disjoint union of a countable
sequence $(E_n)$ of sets of finite measure. Define $h(x) = \sum_{n=1}^\infty \frac{\chi_{E_n}(x)}{2^n \mu(E_n)}$. By the
monotone convergence theorem,

$$\int_X h\,d\mu = \sum_{n=1}^\infty \frac{1}{2^n \mu(E_n)}\int_X \chi_{E_n}\,d\mu = \sum_{n=1}^\infty \frac{1}{2^n} = 1.$$

Thus $h \in \mathfrak{L}^1(\mu)$. Let $\nu$ be the finite positive measure such that $d\nu = h\,d\mu$. For
$1 \le p < \infty$, the correspondence $F \mapsto h^{1/p}F$ defines a linear isometry from $\mathfrak{L}^p(\nu)$
onto $\mathfrak{L}^p(\mu)$ (theorem 8.5.8 is relevant here and for the rest of the proof). In
particular, $\psi(F) = \varphi(h^{1/p}F)$ defines a bounded linear functional on $\mathfrak{L}^p(\nu)$. By the
first part of the proof, there exists a function $G \in \mathfrak{L}^q(\nu)$ such that $\psi(F) = \int_X FG\,d\nu$,
for every $F \in \mathfrak{L}^p(\nu)$.

If $1 < p < \infty$, define $g = h^{1/q}G$. By theorem 8.5.8, $\int_X |g|^q\,d\mu = \int_X |G|^q\,d\nu < \infty$;
hence $g \in \mathfrak{L}^q(\mu)$. For $f \in \mathfrak{L}^p(\mu)$,

$$\varphi(f) = \psi(h^{-1/p}f) = \int_X h^{-1/p}fG\,d\nu = \int_X h^{-1/p}fGh\,d\mu = \int_X fh^{1/q}G\,d\mu = \int_X fg\,d\mu,$$

as desired.

If $p = 1$, define $g = G$. Because $\|G\|_{\mu,\infty} = \|G\|_{\nu,\infty}$, $g \in \mathfrak{L}^\infty(\mu)$. If $f \in \mathfrak{L}^1(\mu)$, then
$\varphi(f) = \psi(h^{-1}f) = \int_X h^{-1}fG\,d\nu = \int_X h^{-1}fgh\,d\mu = \int_X fg\,d\mu$. ∎


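Exercises


1. Let $\mu$ be the counting measure on $\mathbb{N}$, and let $f : \mathbb{N} \to \mathbb{K}$. Show that $f \in \mathfrak{L}^p(\mu)$
if and only if $f \in l^p$ as defined in section 3.6.

2. Show that, for an essentially bounded function $f$, $\|f\|_\infty$ is an essential upper
bound of $f$.

3. Prove that if $\mu(X) < \infty$ and $1 \le p < q \le \infty$, then $\mathfrak{L}^q(\mu) \subseteq \mathfrak{L}^p(\mu)$.

4. Let $f_n$ be a convergent sequence in $\mathfrak{L}^1(\mu)$, and let $f = \lim_n f_n$. For $\epsilon > 0$,
define $E_{n,\epsilon} = \{x \in X : |f_n(x) - f(x)| \ge \epsilon\}$. Show that $\lim_n \mu(E_{n,\epsilon}) = 0$.

5. Let $f \in \mathfrak{L}^\infty(\mu)$, and suppose that $\mu(X) < \infty$. Prove that $\lim_{p \to \infty} \|f\|_p = \|f\|_\infty$.

6. Let $f \in \mathfrak{L}^\infty(\mu)$, and suppose that $\mu(X) < \infty$. Show that $\lim_n \frac{\|f^{n+1}\|_1}{\|f^n\|_1} = \|f\|_\infty$.

7. Show that if $p_1, \ldots, p_m > 1$ are such that $\frac{1}{p_1} + \cdots + \frac{1}{p_m} = 1$ and $f_i \in \mathfrak{L}^{p_i}$, then
$f_1 \cdots f_m \in \mathfrak{L}^1$, and $\|f_1 \cdots f_m\|_1 \le \|f_1\|_{p_1} \cdots \|f_m\|_{p_m}$.

8. Show that if $f \in \mathfrak{L}^{p_1}$ and $g \in \mathfrak{L}^{p_2}$, then $fg \in \mathfrak{L}^p$ for some $p$.

9. Let $f : X \to [0, \infty)$ be in $\mathfrak{L}^p$, and let $f_m = \min\{f, m\}$. Prove that $f_m$ converges
to $f$ in $\mathfrak{L}^p$.

10. Show that if $f_n \to f$ in $\mathfrak{L}^p$ and $g_n \to g$ in $\mathfrak{L}^q$, then $f_n g_n \to fg$ in $\mathfrak{L}^1$. Here $p$ and
$q$ are conjugate Hölder exponents.

11. Let $\mu$ and $\nu$ be finite positive measures such that $\nu \ll \mu \ll \nu$. Prove that
$\mathfrak{L}^\infty(\mu) = \mathfrak{L}^\infty(\nu)$.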

8.7 Approximation 


In this section, we prove a large collection of approximation theorems. The highlights
include approximating $\mathfrak{L}^p$ functions by simple functions and continuous
functions of compact support. We prove that trigonometric polynomials are dense
in $\mathfrak{L}^2(-\pi,\pi)$, which is the last piece of information we need to settle the question
of the convergence of Fourier series of functions in $\mathfrak{L}^2(-\pi,\pi)$. The important
operation of convolving functions makes its debut in this section. Finally, we
study approximations by $C^\infty$ functions, prove the $C^\infty$ version of Urysohn's lemma,
and prove that $C_c^\infty(\mathbb{R}^n)$ is dense in $\mathfrak{L}^p(\mathbb{R}^n)$.


Lemma 8.7.1 (the Tietze extension theorem). Let $K$ be a compact subset of $\mathbb{R}^n$ and
let $f : K \to [0,1]$ be continuous. Then $f$ can be extended to a continuous function
$g \in C_c(\mathbb{R}^n)$ such that $0 \le g \le 1$. If $K$ is contained in an open set $U$, then $g$ can be
constructed in such a way that $\operatorname{supp}(g) \subseteq U$.


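Proof. Let $W$ be an open set such that $\overline{W}$ is compact and $K \subseteq W \subseteq \overline{W} \subseteq U$. Let $K_1 =
f^{-1}([0, 1/3])$, and $K_2 = f^{-1}([2/3, 1])$. Applying lemma 8.4.1 to the closed sets $E =
K_1 \cup (\mathbb{R}^n - W)$, and $F = K_2$ produces a continuous function $g_1 : \mathbb{R}^n \to [0, 1/3]$
such that $g_1(E) = 0$, and $g_1(F) = 1/3$. By construction, $\operatorname{supp}(g_1) \subseteq \overline{W}$, and $0 \le
f - g_1 \le 2/3$ on $K$. Applying the same construction to the function $f - g_1$, we can
find a function $g_2 : \mathbb{R}^n \to [0, \frac{1}{3}\cdot\frac{2}{3}]$ such that $\operatorname{supp}(g_2) \subseteq \overline{W}$, and $0 \le f - g_1 - g_2 \le
(\frac{2}{3})^2$ on the set $K$. Continuing this construction yields a sequence of continuous
functions $g_i$ on $\mathbb{R}^n$ such that $\operatorname{supp}(g_i) \subseteq \overline{W}$, $0 \le g_i \le \frac{1}{3}(\frac{2}{3})^{i-1}$, and $0 \le f - g_1 - \cdots - g_i
\le (\frac{2}{3})^i$ on $K$. The sequence $G_i = g_1 + \cdots + g_i$ is supported in $\overline{W}$ and is Cauchy in
the uniform norm on the compact set $\overline{W}$. Therefore $G_i$ converges uniformly to a
continuous function $g$. Since $0 \le f - G_i \le (\frac{2}{3})^i$ on $K$, $g = f$ on $K$. Because each $G_i$
is supported in $\overline{W} \subseteq U$, so is $g$. ∎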


Remark. The Tietze extension theorem is valid for locally compact Hausdorff 
spaces. See problem 1 at the end of this section. 


Proposition 8.7.2 (Egoroff's theorem). Let $(X, \mathfrak{M}, \mu)$ be a finite measure space.
Suppose the functions $f$ and $(f_n)_{n=1}^\infty$ are measurable and finite a.e. and that
$\lim_n f_n(x) = f(x)$ for a.e. $x \in X$. Then, for every positive number $\delta$, there exists
a measurable set $E$ such that $\mu(E) < \delta$ and $f_n$ converges to $f$ uniformly on $X - E$.


Proof. First we show that, for every pair of positive real numbers $\epsilon$ and $\delta$, there exists
a measurable set $A$ and an integer $N \ge 1$ such that $\mu(A) < \delta$ and $\sup\{|f_k(x) -
f(x)| : x \in X - A\} \le \epsilon$ for every $k \ge N$. Define $C_k = \{x \in X : |f_k(x) - f(x)| < \epsilon\}$
and let $D_n = \bigcap_{k=n}^\infty C_k = \{x \in X : |f_k(x) - f(x)| < \epsilon \text{ for every } k \ge n\}$. Clearly,
$D_1 \subseteq D_2 \subseteq \cdots$. The set $X - \bigcup_{n=1}^\infty D_n$ is contained in the set $\{x \in X : \lim_n f_n(x) \ne
f(x)\}$ which, by assumption, has measure 0. It follows that $\lim_n \mu(D_n) = \mu(X)$.
Therefore there exists a positive integer $N$ such that $\mu(X - D_N) < \delta$. Set $A =
X - D_N$. This proves our assertion because if $x \notin A$, then $x \in C_k$ for every $k \ge N$,
and $|f_k(x) - f(x)| < \epsilon$ for every $k \ge N$.


For a fixed $\delta > 0$, and each $k \in \mathbb{N}$, let $\delta_k = \delta/2^k$. Applying the above construction
to the pair $\epsilon_k = 1/k$, and $\delta_k$, we find a measurable set $A_k$ such that $\mu(A_k) < \delta/2^k$
and a positive integer $n_k$ such that $\sup\{|f_n(x) - f(x)| : x \in X - A_k\} \le 1/k$ for
$n \ge n_k$. Define $E = \bigcup_{k=1}^\infty A_k$. Then $\mu(E) \le \sum_k \mu(A_k) < \sum_k \delta/2^k = \delta$.

Now let $\epsilon > 0$, and choose a positive integer $k$ such that $1/k < \epsilon$.
Now, for $n \ge n_k = N$,

$$\sup\{|f_n(x) - f(x)| : x \in X - E\} \le \sup\{|f_n(x) - f(x)| : x \in X - A_k\} \le 1/k < \epsilon.$$

Thus $f_n$ converges uniformly to $f$ on $X - E$. ∎
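
To make Egoroff's theorem concrete, consider a standard illustration of our own choosing (it is not taken from the text): $f_n(x) = x^n$ on $[0,1]$ with Lebesgue measure. The pointwise limit is 0 on $[0,1)$, yet $\sup_{0 \le x < 1} x^n = 1$ for every $n$, so the convergence is not uniform there. Discarding the set $(1 - \delta, 1]$, whose measure $\delta$ can be taken as small as desired, restores uniform convergence, because $x^n$ is increasing and its supremum over $[0, 1 - \delta]$ is $(1 - \delta)^n$. The function names below are hypothetical.

```python
def sup_deviation(n, delta):
    """sup of |x**n - 0| over [0, 1 - delta]; x**n is increasing, so the sup is at the right end."""
    return (1.0 - delta) ** n


def first_index_within(eps, delta):
    """Smallest N such that sup_deviation(n, delta) < eps for every n >= N.

    sup_deviation is decreasing in n, so the first index that works keeps working.
    Requires 0 < delta <= 1 and eps > 0.
    """
    n = 1
    while sup_deviation(n, delta) >= eps:
        n += 1
    return n


for delta in (0.1, 0.01):
    print(delta, first_index_within(0.01, delta))
```

The index $N$ grows as $\delta$ shrinks, which reflects the fact that no single $N$ works on all of $[0,1)$.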


Before we proceed to the next theorem, we call the reader's attention to the fact that
$\mathfrak{L}^\infty$ contains all simple functions, while a simple function $s = \sum_{i=1}^m a_i\chi_{E_i}$ belongs
to $\mathfrak{L}^p$ if and only if the support of $s$ has finite measure, that is, if $\mu(E_i) < \infty$ for all
$1 \le i \le m$.


Theorem 8.7.3. Let $(X, \mathfrak{M}, \mu)$ be a measure space. For $1 \le p \le \infty$, the simple
functions that belong to $\mathfrak{L}^p(\mu)$ are dense in $\mathfrak{L}^p(\mu)$.


Proof. We will show that if $f \in \mathfrak{L}^p$, there is a sequence of simple functions $s_n$ such
that $\lim_n \|f - s_n\|_p = 0$. By theorem 8.3.2, there exists a sequence $s_1, s_2, \ldots$ of simple
functions such that $|s_1| \le |s_2| \le \cdots \le |f|$ and $\lim_n s_n(x) = f(x)$.
Clearly, $|f - s_n| \le 2|f|$; hence, for $1 \le p < \infty$, $|f - s_n|^p \le 2^p|f|^p \in \mathfrak{L}^1$. By the dominated
convergence theorem, $\lim_n \|f - s_n\|_p = 0$.
For $p = \infty$, notice that if $n > \|f\|_\infty$, then, for a.e. $x \in X$, $0 \le f(x) - s_n(x) \le \frac{1}{2^n}$
(theorem 8.3.2). Clearly, $\lim_n \|f - s_n\|_\infty = 0$. ∎


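Lemma 8.7.4. Let $s = \sum_{i=1}^m a_i\chi_{E_i}$ be a simple function on $\mathbb{R}^n$, and let $E = \bigcup_{i=1}^m E_i$.
It is assumed that $E_1, \ldots, E_m$ are pairwise disjoint. If $\lambda(E) < \infty$, then, for every
$\epsilon > 0$, there exists a function $g \in C_c(\mathbb{R}^n)$ such that $\lambda(\{x \in \mathbb{R}^n : s(x) \ne g(x)\}) < \epsilon$,
and $\|g\|_\infty \le \|s\|_\infty$.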


Proof. Let $U$ be an open set containing $E$ such that $\lambda(U - E) < \epsilon/2$. For each $1 \le
i \le m$, let $K_i$ be a compact subset of $E_i$ such that $\lambda(E_i - K_i) < \frac{\epsilon}{2m}$, and set $H =
\bigcup_{i=1}^m K_i$. Notice that $\lambda(E - H) < \epsilon/2$. For $1 \le i \le m$, define $V_i = U - \bigcup_{j \ne i} K_j$. By
theorem 8.4.3, there exist functions $g_i \in C_c(\mathbb{R}^n)$ such that $K_i \prec g_i \prec V_i$. Now define
$g = \sum_{i=1}^m a_i g_i$. Clearly, $g \in C_c(\mathbb{R}^n)$, $g|_H = s|_H$, and $g$ vanishes outside $U$. The set
$\{x \in \mathbb{R}^n : s(x) \ne g(x)\}$ is contained in the union of $U - E$ and $E - H$, and the
Lebesgue measure of each of the two sets is less than $\epsilon/2$. If $\|g\|_\infty > \|s\|_\infty$, we
modify $g$ as follows to satisfy the last requirement of the theorem. Let $S = \{z \in
\mathbb{C} : |z| \le \|s\|_\infty\}$, and $T = \{z \in \mathbb{C} : |z| \le \|g\|_\infty\}$. Define $\varphi : T \to S$ by

$$\varphi(z) = \begin{cases} z & \text{if } z \in S, \\ \dfrac{\|s\|_\infty\, z}{|z|} & \text{if } z \in T - S. \end{cases}$$

$\varphi$ is continuous, and $|\varphi(z)| = \|s\|_\infty$ for every $z \in T - S$. Now let $h = \varphi \circ g$.
Clearly, $h(x) = 0$ when $g(x) = 0$, and hence $h \in C_c(\mathbb{R}^n)$, and $\|h\|_\infty \le \|s\|_\infty$. ∎


Lemma 8.7.4 is a very special case of the following well-known theorem, which 
says, loosely speaking, that a measurable function on a set of finite Lebesgue 
measure is not too far from being continuous. 


Theorem 8.7.5 (Luzin's theorem). Let $f : \mathbb{R}^n \to \mathbb{C}$ be a measurable function, and
suppose that there exists a set $E$ of finite Lebesgue measure such that $f(x) = 0$ for
every $x \in \mathbb{R}^n - E$. Then, for every $\epsilon > 0$, there exists a function $g \in C_c(\mathbb{R}^n)$ such
that the set $\{x \in E : f(x) \ne g(x)\}$ has Lebesgue measure less than $\epsilon$. Moreover, if
$f$ is bounded, $g$ can be chosen in such a way that $\|g\|_\infty \le \|f\|_\infty$.


Proof. Let $U$ be an open set such that $E \subseteq U$ and $\lambda(U - E) < \epsilon/3$. Let $s_i$ be a sequence
of simple measurable functions such that $|s_1| \le |s_2| \le \cdots \le |f|$ and $\lim_i s_i(x) =
f(x)$. Since $f$ is supported in $E$, each $s_i$ is supported in $E$. By Egoroff's theorem,
there exists a subset $A$ of $E$ such that $\lambda(A) < \epsilon/3$, and the sequence $s_i$ converges
uniformly to $f$ on $E - A$. By the proof of lemma 8.7.4, there exist compact sets
$H_i \subseteq E - A$ such that $\lambda((E - A) - H_i) < \frac{\epsilon}{3 \cdot 2^i}$ and functions $g_i \in C_c(\mathbb{R}^n)$ such that
$g_i|_{H_i} = s_i|_{H_i}$ and each $g_i$ vanishes outside $U$. Now let $K = \bigcap_{i=1}^\infty H_i$. Clearly,
$K$ is compact, and $\lambda((E - A) - K) < \epsilon/3$. The sequence of continuous functions
$g_i$ converges uniformly to $f$ on $K$. Thus $f|_K$ is continuous. By the Tietze extension
theorem, there exists a function $g \in C_c(\mathbb{R}^n)$ that extends $f|_K$, and $g(x) = 0$ for every
$x \notin U$. The set $\{x \in \mathbb{R}^n : f(x) \ne g(x)\}$ is contained in the union of $U - E$, $A$, and
$(E - A) - K$, and each of the three sets has Lebesgue measure less than $\epsilon/3$.

If $\|f\|_\infty < \infty$ and $\|g\|_\infty > \|f\|_\infty$, we modify $g$ as in the proof of lemma 8.7.4 to
satisfy the requirement $\|g\|_\infty \le \|f\|_\infty$. ∎


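Theorem 8.7.6. For $1 \le p < \infty$, $C_c(\mathbb{R}^n)$ is dense in $\mathfrak{L}^p(\mathbb{R}^n)$.


Proof. Let $f \in \mathfrak{L}^p(\mathbb{R}^n)$, and let $\epsilon > 0$. By theorem 8.7.3, we may assume that $f = s$, a
simple function with $\lambda(\operatorname{supp}(s)) < \infty$. Lemma 8.7.4 produces a set $A$ of measure
less than $\epsilon$ and a function $g \in C_c(\mathbb{R}^n)$ such that $s(x) = g(x)$ for $x \notin A$, and
$\|g\|_\infty \le \|s\|_\infty$. Thus $|g - s| \le |g| + |s| \le 2\|s\|_\infty$. Hence $\|g - s\|_p^p = \int_A |g - s|^p\,d\lambda \le
2^p\|s\|_\infty^p\,\lambda(A) < 2^p\|s\|_\infty^p\,\epsilon$. ∎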


Remark. Lemma 8.7.4 and theorems 8.7.5 and 8.7.6 are valid for Radon measures
on locally compact Hausdorff spaces without any alterations to the proofs
included for the last three results. For example, in the proof of lemma 8.7.4,
we only used the inner regularity of measurable sets of finite measure.

Footnote to the proof of lemma 8.7.4: Observe that $\varphi$ simply fixes $S$ and retracts the annulus
between the disks $T$ and $S$ radially onto the boundary of $S$.


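Example 1. Let $[a,b]$ be a compact interval in $\mathbb{R}$. For $1 \le p < \infty$, $C[a,b]$ is dense
in $\mathfrak{L}^p(a,b)$.

Let $f \in \mathfrak{L}^p(a,b)$, and extend $f$ to a function $\tilde{f} \in \mathfrak{L}^p(\mathbb{R})$ by defining $\tilde{f}$ to be 0
outside $(a,b)$. By theorem 8.7.6, there exists a function $g \in C_c(\mathbb{R})$ such that
$\|\tilde{f} - g\|_p < \epsilon$. The restriction of $g$ to $[a,b]$ is in $C[a,b]$, and $\int_a^b |f(x) - g(x)|^p\,dx \le
\int_{\mathbb{R}} |\tilde{f} - g|^p\,dx < \epsilon^p$. ∎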


Example 2. For a function $f \in C[-\pi,\pi]$ (not necessarily periodic), and for
every $\epsilon > 0$, there exists a $2\pi$-periodic function $g$ such that $\|f - g\|_p < \epsilon$. Here
$1 \le p < \infty$.

The function $g$ is obtained by modifying $f$ near $\pm\pi$ in the exact same
manner as in the proof of lemma 4.10.2. ∎


Example 3. The space $C(S^1)$ is dense in $\mathfrak{L}^p(-\pi,\pi)$. Also, trigonometric
polynomials are dense in $\mathfrak{L}^p(-\pi,\pi)$ for $p \in [1, \infty)$.


This result follows immediately from the last two examples and theorem 
4.10.1. 


We conclude this subsection by proving the following separability result.

Theorem 8.7.7. For $1 \le p < \infty$, $\mathfrak{L}^p(\mathbb{R}^n)$ is separable.


Proof. Let $\mathfrak{S}$ be the collection of half-open boxes of the form $\sigma = [a_1, b_1) \times \cdots \times
[a_n, b_n)$, where $a_i, b_i \in \mathbb{Q}$. Define $D$ to be the collection of linear combinations of
characteristic functions of members of $\mathfrak{S}$ with rational coefficients. Thus a member
of $D$ is a simple function of the form $s = \sum_{i=1}^m c_i\chi_{\sigma_i}$, where $m \in \mathbb{N}$, the coefficients
$c_i$ are rational numbers, and $\sigma_i \in \mathfrak{S}$. It is clear that $D$ is countable. We prove that
it is dense in $\mathfrak{L}^p(\mathbb{R}^n)$. In light of theorem 8.7.6, it suffices to show that if $f \in C_c(\mathbb{R}^n)$
and $\epsilon > 0$, then there is a function $s \in D$ such that $\|f - s\|_p < c\epsilon$ for some constant
$c$, which is independent of $\epsilon$.

By the uniform continuity of $f$, there exists a number $\delta > 0$ such that $|f(x) -
f(y)| < \epsilon$ whenever $\|x - y\| < \delta$. Let $Q$ be a box in $\mathfrak{S}$ that contains $\operatorname{supp}(f)$ in
its interior. Partition $Q$ into disjoint sub-boxes $\sigma_1, \ldots, \sigma_m$, where each $\sigma_i \in \mathfrak{S}$,
and $\operatorname{diam}(\sigma_i) < \delta$. For each $1 \le i \le m$, choose a rational number $c_i$ such that
$\min_{x \in \overline{\sigma_i}} f(x) \le c_i \le \max_{x \in \overline{\sigma_i}} f(x)$. Finally, define $s = \sum_{i=1}^m c_i\chi_{\sigma_i}$. By construction,
$\|f - s\|_\infty < \epsilon$.

Now $\|f - s\|_p^p = \int_Q |f - s|^p\,d\lambda \le \|f - s\|_\infty^p \operatorname{vol}(Q) < \epsilon^p \operatorname{vol}(Q)$. ∎


Approximation by $C^\infty$ Functions


Definition. For Lebesgue measurable functions $f$ and $g$ on $\mathbb{R}^n$, the convolution of
$f$ and $g$ is the function

$$(f * g)(x) = \int_{\mathbb{R}^n} f(x - y)g(y)\,dy.$$

It is clear that if $(f * g)(x)$ is finite, then $(f * g)(x) = (g * f)(x)$. Thus

$$(f * g)(x) = \int_{\mathbb{R}^n} f(x - y)g(y)\,dy = \int_{\mathbb{R}^n} f(y)g(x - y)\,dy.$$

A variety of conditions can be imposed on $f$ and $g$ to guarantee the finiteness of
the integral, at least for a.e. $x \in \mathbb{R}^n$. We take for granted the measurability of the
function $f(x - y)g(y)$.

In this subsection, we will limit the functions $f$ and $g$ to be continuous functions
of compact support. The reader can look at the section exercises for a slightly
expanded discussion of the properties of convolutions.


Lemma 8.7.8. Let $f, g \in C_c(\mathbb{R}^n)$. Then


(a) $(f * g)(x)$ exists for all $x \in \mathbb{R}^n$, and
(b) $f * g \in C_c(\mathbb{R}^n)$.


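Proof. (a) Let $K = \operatorname{supp}(g)$. Then

$$|(f * g)(x)| \le \int_{\mathbb{R}^n} |f(x - y)g(y)|\,dy \le \|f\|_\infty \int_K |g(y)|\,dy \le \|f\|_\infty \|g\|_\infty \lambda(K) < \infty.$$

(b) Let $F$ be the closure of the (bounded) set $\{x + y : x \in \operatorname{supp}(f), y \in \operatorname{supp}(g)\}$.
We claim that $f * g$ is supported inside $F$. If $x \notin F$, then, for every $y \in \operatorname{supp}(g)$,
$x - y \notin \operatorname{supp}(f)$. Thus $f(x - y)g(y) = 0$ for all $y \in \mathbb{R}^n$; hence $(f * g)(x) = 0$.

Let $\epsilon > 0$. Since $f$ is uniformly continuous, there exists a number $\delta > 0$ such that,
for $\xi, \eta \in \mathbb{R}^n$, $|f(\xi) - f(\eta)| < \epsilon$ whenever $\|\xi - \eta\| < \delta$. Now, for such $\xi$ and $\eta$,

$$|(f * g)(\xi) - (f * g)(\eta)| \le \int_{\mathbb{R}^n} |f(\xi - y) - f(\eta - y)|\,|g(y)|\,dy \le \epsilon \int_K |g(y)|\,dy \le \epsilon \|g\|_\infty \lambda(K).$$ ∎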


Lemma 8.7.9. Let $f \in \mathfrak{L}^p(\mathbb{R}^n)$, where $1 \le p < \infty$. Then $\lim_{a \to 0} \|\tau_a f - f\|_p = 0$.




Proof. Recall that $(\tau_a f)(x) = f(x - a)$. First we prove the result for a function
$g \in C_c(\mathbb{R}^n)$. Without loss of generality, assume that $\|a\| \le 1$. Thus the functions
$\tau_a g$ have a common support, say, $K$. Let $\epsilon > 0$. By the uniform continuity of $g$,
there exists a number $\delta > 0$ such that whenever $\|a\| < \delta$, then $|(\tau_a g)(x) - g(x)| =
|g(x - a) - g(x)| < \epsilon$. Now $\|\tau_a g - g\|_p^p = \int_K |g(x - a) - g(x)|^p\,dx \le \epsilon^p\lambda(K)$.

Now let $f \in \mathfrak{L}^p(\mathbb{R}^n)$, and let $\epsilon > 0$. By theorem 8.7.6, there is a function $g \in
C_c(\mathbb{R}^n)$ such that $\|f - g\|_p < \epsilon/3$. By the first part of the proof, there is $\delta > 0$ such
that for $\|a\| < \delta$, $\|\tau_a g - g\|_p < \epsilon/3$. Now if $\|a\| < \delta$, then

$$\begin{aligned}
\|f - \tau_a f\|_p &\le \|f - g\|_p + \|g - \tau_a g\|_p + \|\tau_a g - \tau_a f\|_p \\
&= \|f - g\|_p + \|g - \tau_a g\|_p + \|g - f\|_p < \epsilon.
\end{aligned}$$ ∎
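
Lemma 8.7.9 can be observed numerically even for a discontinuous function. For the indicator function of $[0,1)$ on $\mathbb{R}$ (an example we chose for illustration), $|\tau_a f - f|$ is the indicator of a set of measure $2|a|$ when $|a| \le 1$, so $\|\tau_a f - f\|_p = (2|a|)^{1/p}$, which tends to 0 with $a$ although $\tau_a f$ never converges to $f$ uniformly. The sketch below approximates the norm with a midpoint rule on $[-2, 3]$; the function names and grid size are assumptions of ours.

```python
def indicator(x):
    """Indicator function of the interval [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0


def translation_distance(a, p, n=200_000):
    """Midpoint-rule approximation of ||tau_a f - f||_p, where (tau_a f)(x) = f(x - a).

    The window [-2, 3] contains the supports of f and tau_a f whenever |a| <= 1.
    """
    lo, hi = -2.0, 3.0
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += abs(indicator(x - a) - indicator(x)) ** p
    return (total * h) ** (1.0 / p)


for a in (0.1, 0.01, 0.001):
    print(a, translation_distance(a, 2.0), (2.0 * a) ** 0.5)
```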


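Definition. A multi-index $\alpha$ is a sequence $\alpha = (\alpha_1, \ldots, \alpha_n)$, where each $\alpha_i$ is a
nonnegative integer. The length of $\alpha$ is the integer $|\alpha| = \sum_{i=1}^n \alpha_i$.


Notation. Let $f$ be a scalar-valued function on $\mathbb{R}^n$, and let $\alpha = (\alpha_1, \ldots, \alpha_n)$ be
a multi-index. The notation $D^\alpha f$ stands for the derivative
$\dfrac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1}\,\partial x_2^{\alpha_2} \cdots \partial x_n^{\alpha_n}}$ if it
exists. For example, if $n = 5$, and $\alpha = (1, 0, 2, 0, 1)$, then $D^\alpha f = \dfrac{\partial^4 f}{\partial x_1\,\partial x_3^2\,\partial x_5}$.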


Definition. A function $f$ is said to be infinitely differentiable if $D^\alpha f$ exists for every
multi-index $\alpha$. The space of infinitely differentiable functions is denoted by
$C^\infty(\mathbb{R}^n)$, and the space of infinitely differentiable functions of compact support
is given the symbol $C_c^\infty(\mathbb{R}^n)$. We will shortly see that there is an abundance of
such functions.


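Example 4. Consider the function

$$f(x) = \begin{cases} \exp\{-1/x\} & \text{if } x > 0, \\ 0 & \text{if } x \le 0. \end{cases}$$

It is easily seen that, for $x > 0$, and $k \in \mathbb{N}$, $f^{(k)}(x) = p(1/x)\exp\{-1/x\}$, where $p$ is
a polynomial of degree $2k$. Therefore $\lim_{x \downarrow 0} f^{(k)}(x) = 0$. Hence $f^{(k)}(0) = 0$, and
$f$ is infinitely differentiable at $x = 0$. Since the differentiability of $f$ at $x \ne 0$ is
obvious, $f \in C^\infty(\mathbb{R})$. ∎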


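Example 5 (the bump function). For a fixed $h > 0$, consider the function

$$\varphi(x) = \begin{cases} \exp\Big\{\dfrac{-h^2}{h^2 - |x|^2}\Big\} & \text{if } |x| < h, \\ 0 & \text{if } |x| \ge h. \end{cases}$$

As $|x| \uparrow h$, $h^2 - |x|^2 \downarrow 0$, so, by example 4, $\varphi \in C_c^\infty(\mathbb{R})$. ∎


Example 6 (the bump kernel). We can use the function $\varphi$ in example 5 to construct
a continuously parameterized family of functions as follows. For a fixed $h > 0$,

$$\delta_h(x) = \begin{cases} A_n h^{-n} \exp\Big\{\dfrac{-h^2}{h^2 - \|x\|^2}\Big\} & \text{if } \|x\| < h, \\ 0 & \text{if } \|x\| \ge h, \end{cases}$$

where $A_n^{-1} = \int_{\|y\| < 1} \exp\Big\{\dfrac{-1}{1 - \|y\|^2}\Big\}\,dy$. By the above examples, $\delta_h \in C_c^\infty(\mathbb{R}^n)$. ∎


Observe that $\max_{x \in \mathbb{R}^n} \delta_h(x) = \delta_h(0) = A_n h^{-n}/e$, and $\int_{\mathbb{R}^n} \delta_h(x)\,dx = 1$.


The first assertion is obvious. For the second assertion, use the change of variable
$x = hy$. By problem 14 on section 8.4, $dx = h^n\,dy$, and

$$\int_{\mathbb{R}^n} \delta_h(x)\,dx = A_n h^{-n} \int_{\|x\| < h} \exp\Big\{\frac{-h^2}{h^2 - \|x\|^2}\Big\}\,dx = A_n \int_{\|y\| < 1} \exp\Big\{\frac{-1}{1 - \|y\|^2}\Big\}\,dy = 1.$$

We call the family $\{\delta_h : h > 0\}$ the bump kernel.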
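
The two observations about the bump kernel can be checked numerically in dimension $n = 1$. The sketch below is only an illustration under our own assumptions: the normalizing constant $A_1$ is approximated with a midpoint rule rather than computed exactly, and the names `bump_profile`, `midpoint_integral`, `A1`, and `delta_h` are hypothetical.

```python
import math


def bump_profile(y):
    """exp(-1 / (1 - y^2)) for |y| < 1, and 0 otherwise."""
    return math.exp(-1.0 / (1.0 - y * y)) if abs(y) < 1.0 else 0.0


def midpoint_integral(func, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))


# Numerical stand-in for A_1, the reciprocal of the integral of the profile over (-1, 1).
A1 = 1.0 / midpoint_integral(bump_profile, -1.0, 1.0)


def delta_h(x, h):
    """One-dimensional bump kernel A_1 * h^(-1) * exp(-h^2 / (h^2 - x^2)) for |x| < h."""
    # exp(-h^2 / (h^2 - x^2)) equals bump_profile(x / h).
    return A1 / h * bump_profile(x / h)


for h in (1.0, 0.5, 0.1):
    total = midpoint_integral(lambda x: delta_h(x, h), -h, h)
    print(h, total, delta_h(0.0, h), A1 / (h * math.e))
```

As $h$ decreases the kernel concentrates near the origin: its peak value $A_1 h^{-1}/e$ grows while its integral stays equal to 1.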


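Lemma 8.7.10. If $f \in C_c^\infty(\mathbb{R}^n)$, and $g \in C_c(\mathbb{R}^n)$, then $f * g \in C_c^\infty(\mathbb{R}^n)$, and, for every
multi-index $\alpha$, $D^\alpha(f * g) = D^\alpha f * g$.


Proof. The proof is by induction on $|\alpha|$. It is sufficient to prove the result when
$|\alpha| = 1$. Thus we need to show that $\frac{\partial}{\partial x_i}(f * g) = \frac{\partial f}{\partial x_i} * g$. For simplicity of notation,
we fix $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ and regard $f$ as a function of the single variable $x_i$,
which we rename $x$. Thus we need to prove that $\frac{d}{dx}(f * g) = f' * g$.

We will show that

$$\lim_{t \to 0} \Big[\frac{(f * g)(x + t) - (f * g)(x)}{t} - (f' * g)(x)\Big] = 0.$$

Let $\epsilon > 0$. By the uniform continuity of $f'$, there is $\delta > 0$ such that
$|f'(\xi) - f'(\eta)| < \epsilon$ whenever $|\xi - \eta| < \delta$. Now

$$\Big|\frac{(f * g)(x + t) - (f * g)(x)}{t} - (f' * g)(x)\Big| = \Big|\int_{\mathbb{R}} \Big\{\frac{f(x + t - y) - f(x - y)}{t} - f'(x - y)\Big\} g(y)\,dy\Big| = \Big|\int_{\mathbb{R}} \{f'(x + \theta t - y) - f'(x - y)\}\,g(y)\,dy\Big|,$$

where $0 < \theta < 1$. Now if $|t| < \delta$, then $|f'(x + \theta t - y) - f'(x - y)| < \epsilon$ and
$|\int_{\mathbb{R}} \{f'(x + \theta t - y) - f'(x - y)\}\,g(y)\,dy| \le \epsilon \int_{\mathbb{R}} |g(y)|\,dy \le \epsilon \|g\|_\infty \lambda(K)$, where $K =
\operatorname{supp}(g)$. ∎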


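As a corollary of the last result, for every $f \in C_c(\mathbb{R}^n)$, $f * \delta_h \in C_c^\infty(\mathbb{R}^n)$.


The following is the $C^\infty$ version of Urysohn's lemma.


Corollary 8.7.11. Let $K$ be a compact subset of an open subset $V$ of $\mathbb{R}^n$. Then there
exists a function $f \in C_c^\infty(\mathbb{R}^n)$ such that $K \prec f \prec V$.


Proof. Let $\delta = \operatorname{dist}(K, \mathbb{R}^n - V)$. Since $K$ is compact, $\delta$ is positive. Define $K_1 =
\{x \in \mathbb{R}^n : \operatorname{dist}(x, K) \le \delta/4\}$, and $V_1 = \{x \in \mathbb{R}^n : \operatorname{dist}(x, K) < \delta/2\}$. Since $K_1$ is
compact, $V_1$ is open, and $K_1 \subseteq V_1$, theorem 8.4.3 produces a function $g$ such that
$K_1 \prec g \prec V_1$. Now choose a number $h < \delta/4$, and define $f = g * \delta_h$. By lemma
8.7.8, $\operatorname{supp}(f) \subseteq \{x \in \mathbb{R}^n : \operatorname{dist}(x, K) \le 3\delta/4\} \subseteq V$.

Clearly, $f(x) \ge 0$. Since $0 \le g(x) \le 1$, $f(x) = \int_{\mathbb{R}^n} g(x - y)\delta_h(y)\,dy \le
\int_{\mathbb{R}^n} \delta_h(y)\,dy = 1$.
It remains to show that $K \prec f$. If $x \in K$, then $B(x, h) \subseteq B(x, \delta/4) \subseteq K_1$; hence
$f(x) = \int_{\|x - y\| < h} g(y)\delta_h(x - y)\,dy = \int_{\|x - y\| < h} \delta_h(x - y)\,dy = 1$. ∎


Proposition 8.7.12. If $f \in C_c(\mathbb{R}^n)$, then, for $1 \le p < \infty$, $f * \delta_h \to f$ in $\mathfrak{L}^p(\mathbb{R}^n)$ as $h \to 0$.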


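Proof. We will make use of the fact that $\int_{\mathbb{R}^n} \delta_h(y)\,dy = 1$ and example 4 on section 8.6:

$$\begin{aligned}
|(f * \delta_h)(x) - f(x)| &= \Big|\int_{\mathbb{R}^n} \{f(x - y) - f(x)\}\,\delta_h(y)\,dy\Big| \\
&\le \int_{\mathbb{R}^n} |f(x - y) - f(x)|\,\delta_h(y)\,dy \\
&\le \Big(\int_{\mathbb{R}^n} |f(x - y) - f(x)|^p\,\delta_h(y)\,dy\Big)^{1/p}.
\end{aligned}$$

Integrating the $p$th power of the extreme sides of the above string and using Fubini's theorem
to switch the order of integration, we have

$$\begin{aligned}
\|f * \delta_h - f\|_p^p &\le \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} |f(x - y) - f(x)|^p\,\delta_h(y)\,dy\,dx \\
&= \int_{\mathbb{R}^n} \delta_h(y) \int_{\mathbb{R}^n} |f(x - y) - f(x)|^p\,dx\,dy \\
&= \int_{\mathbb{R}^n} \|\tau_y f - f\|_p^p\,\delta_h(y)\,dy = \int_{\|y\| \le h} \|\tau_y f - f\|_p^p\,\delta_h(y)\,dy.
\end{aligned}$$

By lemma 8.7.9, there exists a number $h_0 > 0$ such that, for $\|y\| < h_0$, $\|\tau_y f - f\|_p <
\epsilon$. Hence, for $h < h_0$, $\int_{\|y\| \le h} \|\tau_y f - f\|_p^p\,\delta_h(y)\,dy \le \epsilon^p \int_{\|y\| \le h} \delta_h(y)\,dy = \epsilon^p$. ∎


Part (a) of the following result is a vast generalization of theorem 8.7.6.


Theorem 8.7.13. (a) For $1 \le p < \infty$, $C_c^\infty(\mathbb{R}^n)$ is dense in $\mathfrak{L}^p(\mathbb{R}^n)$.
(b) $C_c^\infty(\mathbb{R}^n)$ is dense in $C_0(\mathbb{R}^n)$.


Proof. (a) Let $f \in \mathfrak{L}^p(\mathbb{R}^n)$, and let $\epsilon > 0$. By theorem 8.7.6, there is a function
$g \in C_c(\mathbb{R}^n)$ such that $\|f - g\|_p < \epsilon/2$. By the previous proposition, we can choose
$h > 0$ small enough so that $\|g - g * \delta_h\|_p < \epsilon/2$. The function $g * \delta_h$ is in $C_c^\infty(\mathbb{R}^n)$,
and $\|f - g * \delta_h\|_p < \epsilon$.


(b) Let $f \in C_0(\mathbb{R}^n)$, and let $\epsilon > 0$. By theorem 5.11.8, there exists a function
$g \in C_c(\mathbb{R}^n)$ such that $\|f - g\|_\infty < \epsilon$. By the uniform continuity of $g$, there is a
number $\delta > 0$ such that $|g(x) - g(y)| < \epsilon$ whenever $\|x - y\| < \delta$. Choose a positive
number $h < \delta$. The proof will be complete if we show that $\|g * \delta_h - g\|_\infty \le \epsilon$. Since
$\int_{\mathbb{R}^n} \delta_h(x - y)\,dy = 1$,

$$\begin{aligned}
|(g * \delta_h)(x) - g(x)| &= \Big|\int_{\mathbb{R}^n} \{g(y) - g(x)\}\,\delta_h(x - y)\,dy\Big| \\
&\le \int_{\|x - y\| < h} |g(y) - g(x)|\,\delta_h(x - y)\,dy \\
&\le \epsilon \int_{\|x - y\| < h} \delta_h(x - y)\,dy = \epsilon.
\end{aligned}$$ ∎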


Exercises 


1. Let $K$ be a compact subset of a locally compact Hausdorff space $X$, and let $f :
K \to [0,1]$ be continuous. Show $f$ can be extended to a continuous function
$g \in C_c(X)$ such that $0 \le g \le 1$. If $K$ is contained in an open set $U$, then $g$ can be
constructed in such a way that $\operatorname{supp}(g) \subseteq U$. Hint: Mimic the proof of lemma
8.7.1, and use theorem 5.11.5.

2. Let $F$ be a closed subset of a normal space $X$, and let $f : F \to [0,1]$ be continuous.
Show $f$ can be extended to a continuous function $g \in C(X)$ such that
$0 \le g \le 1$. If $F$ is contained in an open set $U$, then $g$ can be constructed in
such a way that $\operatorname{supp}(g) \subseteq U$. Hint: Modify the proof of lemma 8.7.1, and use
theorem 5.11.2. In this case, convergence of the functions $G_i$ takes place in
the space $BC(X)$.

3. Let $\mu$ be a $\sigma$-finite measure on $X$, let $(f_n)$ be a sequence of measurable functions,
and suppose that $\lim_n f_n(x) = f(x)$. Prove that there exists a sequence
$(E_i)$ of measurable sets such that $f_n$ converges uniformly to $f$ on each $E_i$ and
$\mu(X - \bigcup_{i=1}^\infty E_i) = 0$.

4. Let $f \in \mathfrak{L}^p(\mathbb{R}^n)$, where $1 \le p < \infty$. Show that the mapping $\mathbb{R}^n \to \mathfrak{L}^p(\mathbb{R}^n)$
defined by $a \mapsto \tau_a f$ is uniformly continuous. Observe that, in lemma 8.7.9,
we established the continuity at $a = 0$.

5. Assuming that $(f * g)(x)$ exists for every $x \in \mathbb{R}^n$, prove that
(a) $f * g = g * f$ and
(b) $\tau_a(f * g) = (\tau_a f) * g = f * (\tau_a g)$.

6. Let $f \in \mathfrak{L}^p(\mathbb{R}^n)$ and $g \in \mathfrak{L}^q(\mathbb{R}^n)$, where $p$ and $q$ are conjugate exponents.
Prove that $f * g \in \mathfrak{L}^\infty(\mathbb{R}^n)$ and that $\|f * g\|_\infty \le \|f\|_p \|g\|_q$.

7. This problem is a continuation of the previous exercise. Prove that if $1 < p <
\infty$, then $f * g$ is uniformly continuous.

8. This problem is a continuation of exercise 6. Show that if $1 < p < \infty$, then
$f * g \in C_0(\mathbb{R}^n)$.

9. Prove that $\dim C_c^\infty(\mathbb{R}^n) = \infty$.


8.8 Product Measures 


Throughout this section, $(X, \mathfrak{M}, \mu)$ and $(Y, \mathfrak{N}, \nu)$ denote a pair of measure spaces.
The objective of this section is to find a reasonable definition of the product
measure on $X \times Y$. Fubini's theorem is one of the section's main results. We also
settle questions about the products of Lebesgue measures in this section.


The basic definitions are motivated by the ideas found in standard calculus
textbooks. Let us look at the simplest case, which is the product of two copies
of the real line with Lebesgue measure, $\lambda$. The problem of computing the area of
a plane region contains all the motivations for the ideas behind the definitions in
this section. Figure 8.6 depicts a (bounded) plane region $E$ in $\mathbb{R}^2$. To compute the
area of $E$, we take a vertical cross section $S_x$ in $E$, and the area of $E$ is obtained
by integrating the length (the Lebesgue measure) of the cross section. The same
can be achieved by taking a horizontal cross section $S^y$ in $E$. Thus the area
(two-dimensional measure) of $E$, denoted $\rho(E)$, is given by

$$\rho(E) = \int_{\mathbb{R}} \lambda(S_x)\,dx = \int_{\mathbb{R}} \lambda(S^y)\,dy.$$

Figure 8.6 Computing the area of a plane region


We also wish the two-dimensional measure $\rho$ to preserve the property that the
area of a rectangle is the product of its dimensions. More generally, if $A$ and $B$ are
measurable subsets of $\mathbb{R}$, then it should be the case that

$$\rho(A \times B) = \lambda(A)\lambda(B).$$
Now see theorem 8.8.9, where the definition of the product measure appears. 
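
The cross-section recipe described above is easy to try on a region whose area is known. The sketch below, an illustration of ours rather than anything from the text, takes the unit disk: the vertical cross section at abscissa $x$ is an interval of length $2\sqrt{1 - x^2}$, and integrating these lengths with a midpoint rule should reproduce the area $\pi$. The function names and the grid size are assumptions.

```python
import math


def section_length(x, r=1.0):
    """Lebesgue measure of the vertical cross section at x of the disk of radius r."""
    return 2.0 * math.sqrt(r * r - x * x) if abs(x) < r else 0.0


def area_by_sections(r=1.0, n=200_000):
    """Integrate the cross-section lengths over [-r, r] with a midpoint rule."""
    step = 2.0 * r / n
    return step * sum(section_length(-r + (i + 0.5) * step, r) for i in range(n))


print(area_by_sections(), math.pi)
```

By symmetry the horizontal cross sections of the disk give the same integral, in line with the two expressions for the area above.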


Before we can achieve any of the above goals, we need to define a reasonable
$\sigma$-algebra in $X \times Y$ where our expectations can materialize. Geometry dictates that
the product of two intervals (or, more generally, measurable subsets) $A$ and $B$ in
$\mathbb{R}$ ought to be measurable in the product space. This immediately suggests that we
look at the smallest $\sigma$-algebra that contains all rectangles, and this provides the
motivation of the definitions below of the product of measurable spaces.


Products of Measurable Spaces 


Definition. A subset of $X \times Y$ of the form $A \times B$, where $A \in \mathfrak{M}$ and $B \in \mathfrak{N}$, is called a
measurable rectangle in $X \times Y$.


Definition. The product of the measurable spaces $(X, \mathfrak{M})$ and $(Y, \mathfrak{N})$ is the
measurable space $(X \times Y, \mathfrak{M} \otimes \mathfrak{N})$, where $\mathfrak{M} \otimes \mathfrak{N}$ is the $\sigma$-algebra generated
by the collection of measurable rectangles.

Definition. For a subset $E \subseteq X \times Y$, and for a fixed element $x \in X$, we define the
x-section of $E$ to be the set $E_x = \{y \in Y : (x, y) \in E\}$. Similarly, for $y \in Y$, the
y-section of $E$ is the set $E^y = \{x \in X : (x, y) \in E\}$.

The following lemma will be used without explicit reference. Its proof is simple. 

Lemma 8.8.1. If $(E_n)$ is a sequence of subsets of $X \times Y$, then

\[ \Big(\bigcup_n E_n\Big)_x = \bigcup_n (E_n)_x \quad\text{and}\quad \Big(\bigcap_n E_n\Big)_x = \bigcap_n (E_n)_x. \]

The corresponding statements for y-sections are true.
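The lemma can be checked by hand on small finite sets. The following is a minimal sketch, assuming subsets of a product are modelled as Python sets of ordered pairs; the helper names `x_section` and `y_section` and the sets `E1`, `E2` are illustrative choices, not notation from the text.

```python
def x_section(E, x):
    """Return the x-section E_x = {y : (x, y) in E} of a set of pairs E."""
    return {b for (a, b) in E if a == x}


def y_section(E, y):
    """Return the y-section E^y = {x : (x, y) in E} of a set of pairs E."""
    return {a for (a, b) in E if b == y}


# Two hypothetical subsets of {0, 1, 2} x {"a", "b"}.
E1 = {(0, "a"), (0, "b"), (1, "b")}
E2 = {(0, "b"), (1, "a"), (2, "a")}

# Sections commute with unions and intersections, as in lemma 8.8.1.
for x in (0, 1, 2):
    assert x_section(E1 | E2, x) == x_section(E1, x) | x_section(E2, x)
    assert x_section(E1 & E2, x) == x_section(E1, x) & x_section(E2, x)
```

The same loop with `y_section` and `y in ("a", "b")` checks the statement for y-sections.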


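Proposition 8.8.2. If $E \in \mathfrak{M} \otimes \mathfrak{N}$, then, for every $x \in X$, $E_x \in \mathfrak{N}$. Likewise, for
every $y \in Y$, $E^y \in \mathfrak{M}$.

Proof. Let $\Omega = \{E \subseteq X \times Y : E_x \in \mathfrak{N} \ \forall x \in X\}$. Clearly, $\Omega$ contains all measurable
rectangles because if $A \in \mathfrak{M}$ and $B \in \mathfrak{N}$, then

\[ (A \times B)_x = \begin{cases} B & \text{if } x \in A, \\ \emptyset & \text{if } x \notin A. \end{cases} \]

If $E \in \Omega$, then $(E')_x = (E_x)' \in \mathfrak{N}$; hence $E' \in \Omega$. Here $E'$ denotes the complement of $E$
in $X \times Y$.

Finally, if $E_n \in \Omega$ for every $n \in \mathbb{N}$, then $(\bigcup_{n=1}^\infty E_n)_x = \bigcup_{n=1}^\infty (E_n)_x \in \mathfrak{N}$. Therefore
$\bigcup_{n=1}^\infty E_n \in \Omega$.

The above shows that $\Omega$ is a $\sigma$-algebra that contains all measurable rectangles;
hence $\Omega \supseteq \mathfrak{M} \otimes \mathfrak{N}$. The proof that $E^y \in \mathfrak{M}$ for every $y \in Y$ is identical to the
above case. ∎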


Definition. Let $f$ be a scalar function on $X \times Y$. For an element $x \in X$, the x-section
of $f$ is the function $f_x : Y \to \mathbb{C}$ defined by $f_x(y) = f(x, y)$. Similarly, the y-section
of $f$ is the function $f^y : X \to \mathbb{C}$ such that $f^y(x) = f(x, y)$.


Proposition 8.8.3. If $f : X \times Y \to \mathbb{C}$ is $\mathfrak{M} \otimes \mathfrak{N}$-measurable, then, for every $x \in X$,
$f_x$ is $\mathfrak{N}$-measurable, and, for every $y \in Y$, $f^y$ is $\mathfrak{M}$-measurable.


Proof. Let $a \in \mathbb{R}$, and let $E = f^{-1}(a, \infty) = \{(x, y) \in X \times Y : f(x, y) > a\}$. Now
$f_x^{-1}(a, \infty)$ is exactly the set $E_x$, which is measurable by the previous proposition.
Thus $f_x$ is $\mathfrak{N}$-measurable. ∎


Definition. An elementary set in $X \times Y$ is a disjoint union of finitely many
measurable rectangles. The collection of elementary sets will be given the
symbol $\mathfrak{E}$.


It is clear that the collection of elementary sets also generates $\mathfrak{M} \otimes \mathfrak{N}$.

Proposition 8.8.4. The collection $\mathfrak{E}$ of elementary sets is an algebra.


Proof. It is clear that the intersection of two measurable rectangles is either empty or
a measurable rectangle. Also $(A \times B)' = (A' \times Y) \cup (A \times B')$, so the complement
of a measurable rectangle is an elementary set.

Let $E = \bigcup_{i=1}^n R_i$ and $F = \bigcup_{j=1}^m S_j$ be elementary sets, where each of $\{R_i\}$ and $\{S_j\}$
is a set of disjoint measurable rectangles. Now $E \cap F = \bigcup\{R_i \cap S_j : 1 \le i \le n, 1 \le j \le m\}$.
This shows that $E \cap F \in \mathfrak{E}$, and that $\mathfrak{E}$ is closed under the formation of
finite intersections.

Now consider the complement of an elementary set $E = \bigcup_{i=1}^n R_i$. Since $E' =
\bigcap_{i=1}^n R_i'$, $E' \in \mathfrak{E}$. Thus $\mathfrak{E}$ is closed under complementation. It is also clear that $\mathfrak{E}$
is closed under the formation of finite disjoint unions.

Now if $E_1, E_2 \in \mathfrak{E}$, then, by the above, $E_1' \cap E_2 \in \mathfrak{E}$; hence $E_1 \cup E_2 = E_1 \cup
(E_1' \cap E_2) \in \mathfrak{E}$. ∎
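The identity $(A \times B)' = (A' \times Y) \cup (A \times B')$ used in the proof is easy to test on finite sets. Below is a small sketch under the assumption that $X$ and $Y$ are finite Python sets; the names `rectangle` and `complement_pieces` and the particular sets are hypothetical.

```python
from itertools import product

# Hypothetical finite ground sets standing in for X and Y.
X = {0, 1, 2}
Y = {"a", "b", "c"}


def rectangle(A, B):
    """Return the rectangle A x B as a set of ordered pairs."""
    return set(product(A, B))


def complement_pieces(A, B):
    """Return the two rectangles A' x Y and A x B' whose union is (A x B)'."""
    return rectangle(X - A, Y), rectangle(A, Y - B)


A, B = {0, 1}, {"a"}
left, right = complement_pieces(A, B)

# The two pieces are disjoint rectangles covering the complement of A x B.
assert left & right == set()
assert left | right == rectangle(X, Y) - rectangle(A, B)
```

Because the two pieces are disjoint, the complement of a measurable rectangle is itself an elementary set, which is the step the proof relies on.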


Definition. Let $T$ be a nonempty set. A monotone class in $T$ is a collection $\mathfrak{C}$ of
subsets of $T$ such that


(a) if $(E_n)$ is an ascending sequence in $\mathfrak{C}$, then $\bigcup_n E_n \in \mathfrak{C}$; and
(b) if $(E_n)$ is a descending sequence in $\mathfrak{C}$, then $\bigcap_n E_n \in \mathfrak{C}$.


Proposition 8.8.5. Given an arbitrary collection $\mathfrak{E}$ of subsets of a nonempty set $T$,
there exists a (unique) smallest monotone class in $T$ that contains $\mathfrak{E}$.


Proof. The intersection of an arbitrary collection of monotone classes in $T$ is clearly
a monotone class. The family of monotone classes containing $\mathfrak{E}$ is nonempty since
$\mathcal{P}(T)$ is such a monotone class. The intersection of the monotone classes in $T$ that
contain $\mathfrak{E}$ is the monotone class we seek. ∎


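Lemma 8.8.6 (the monotone class lemma). Let $\mathfrak{E}$ be an algebra of subsets in a
nonempty set $T$. Then the smallest monotone class in $T$ containing $\mathfrak{E}$ is the
$\sigma$-algebra generated by $\mathfrak{E}$. In particular, if an algebra in $T$ is a monotone class, then
it is a $\sigma$-algebra.


Proof. Let $\mathfrak{M}$ be the $\sigma$-algebra generated by $\mathfrak{E}$, and let $\mathfrak{M}_0$ be the smallest monotone
class in $T$ containing $\mathfrak{E}$. Since $\mathfrak{M}$ is a monotone class containing $\mathfrak{E}$, $\mathfrak{M}_0 \subseteq \mathfrak{M}$.
Thus we need to establish the reverse inclusion. It is clearly sufficient to show that
$\mathfrak{M}_0$ is a $\sigma$-algebra.

We first show that $\mathfrak{M}_0$ is an algebra. Let $\mathfrak{M}_0' = \{E \subseteq T : E' \in \mathfrak{M}_0\}$. It is clear
that $\mathfrak{M}_0'$ is a monotone class in $T$ and that $\mathfrak{E} \subseteq \mathfrak{M}_0'$. Thus $\mathfrak{M}_0 \subseteq \mathfrak{M}_0'$; hence $\mathfrak{M}_0$
is closed under complementation.

For a member $F \in \mathfrak{M}_0$, define $\Omega(F) = \{E \in \mathfrak{M}_0 : E \cup F \in \mathfrak{M}_0\}$. It is easy to
verify that $\Omega(F)$ is a monotone class in $T$. Now if $G \in \mathfrak{E}$, then $\Omega(G)$ contains
$\mathfrak{E}$, so $\Omega(G) = \mathfrak{M}_0$. Hence, for any $H \in \mathfrak{M}_0$, $H \in \Omega(G)$. By the very definition of
$\Omega(H)$, $G \in \Omega(H)$, so $\mathfrak{E} \subseteq \Omega(H)$ for each $H \in \mathfrak{M}_0$. Because $\Omega(H)$ is a monotone
class, $\mathfrak{M}_0 = \Omega(H)$, so $\mathfrak{M}_0$ is closed under finite unions and is therefore an algebra.

Now if $(E_n)$ is a sequence of members of $\mathfrak{M}_0$, let $B_n = \bigcup_{i=1}^n E_i$. Because $\mathfrak{M}_0$ is an
algebra, each $B_n \in \mathfrak{M}_0$. Since $\mathfrak{M}_0$ is a monotone class, it follows that $\bigcup_{n=1}^\infty E_n =
\bigcup_{n=1}^\infty B_n$ is in $\mathfrak{M}_0$. This shows that $\mathfrak{M}_0$ is a $\sigma$-algebra, and the proof is complete. ∎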


The following result is immediate. 
Corollary 8.8.7. The $\sigma$-algebra $\mathfrak{M} \otimes \mathfrak{N}$ is the smallest monotone class that contains
the algebra $\mathfrak{E}$ of elementary sets.


Product Measures 


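Theorem 8.8.8. Suppose $(X, \mathfrak{M}, \mu)$ and $(Y, \mathfrak{N}, \nu)$ are $\sigma$-finite measure spaces.
For a subset $E \in \mathfrak{M} \otimes \mathfrak{N}$, and for $x \in X$, $y \in Y$, define

\[ \varphi(x) = \nu(E_x), \quad\text{and}\quad \psi(y) = \mu(E^y). \]

Then

(i) $\varphi$ is $\mathfrak{M}$-measurable,
(ii) $\psi$ is $\mathfrak{N}$-measurable, and
(iii) $\int_X \varphi\,d\mu = \int_Y \psi\,d\nu$.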


Proof. Let $\Omega$ be the collection of members of $\mathfrak{M} \otimes \mathfrak{N}$ for which all three conclusions
of the theorem hold. We will show that $\Omega = \mathfrak{M} \otimes \mathfrak{N}$.


First, we establish a number of facts.


(a) $\Omega$ contains all elementary sets.

If $E = A \times B$ is a measurable rectangle, then $\nu((A \times B)_x) = \chi_A(x)\nu(B)$ and
$\mu((A \times B)^y) = \chi_B(y)\mu(A)$ are measurable (recall that characteristic functions of
measurable sets are measurable functions); hence

\[ \int_X \nu(E_x)\,d\mu = \int_X \nu(B)\chi_A\,d\mu = \nu(B)\mu(A) = \int_Y \mu(A)\chi_B\,d\nu = \int_Y \mu(E^y)\,d\nu. \]

Now the result is true for elementary sets
because of the additivity of measures and the linearity of integrals.


(b) If $E_n \in \Omega$ and $E_1 \subseteq E_2 \subseteq \dots$, then $E = \bigcup_{n=1}^\infty E_n \in \Omega$.

Write $\varphi_n(x) = \nu((E_n)_x)$, $\psi_n(y) = \mu((E_n)^y)$, $\varphi(x) = \nu(E_x)$, and $\psi(y) = \mu(E^y)$.
Now $\varphi_n(x)$ increases to $\varphi(x) = \nu(E_x)$, and $\psi_n(y)$ increases to $\psi(y) = \mu(E^y)$.
By assumption, $\varphi_n$ and $\psi_n$ are measurable, so $\varphi$ and $\psi$ are measurable. Also
by assumption, $\int_X \varphi_n\,d\mu = \int_Y \psi_n\,d\nu$. By the monotone convergence theorem,
conclusion (iii) holds for the set $E$.


(c) If $E_1 \supseteq E_2 \supseteq \dots$ is a sequence in $\Omega$ and if $E_1 \subseteq A \times B$ for some measurable
rectangle $A \times B$ with $\mu(A) < \infty$ and $\nu(B) < \infty$, then $E = \bigcap_{n=1}^\infty E_n \in \Omega$.

In the notation of the proof of fact (b), $\varphi_n$ decreases to $\varphi$, and $\psi_n$ decreases to $\psi$.
Thus $\varphi$ and $\psi$ are measurable. Since $(E_1)_x \subseteq (A \times B)_x$, $\nu((E_1)_x) \le \nu((A \times B)_x) =
\nu(B)\chi_A(x)$. Therefore $\int_X \varphi_1\,d\mu = \int_X \nu((E_1)_x)\,d\mu \le \int_X \nu(B)\chi_A\,d\mu = \mu(A)\nu(B) <
\infty$. Similarly, $\int_Y \psi_1\,d\nu < \infty$. By assumption, $\int_X \varphi_n\,d\mu = \int_Y \psi_n\,d\nu$. Fact (c) now
follows from the dominated convergence theorem.


(d) If $(E_n)$ is a disjoint sequence in $\Omega$, then $E = \bigcup_{n=1}^\infty E_n \in \Omega$.

For each $n \in \mathbb{N}$, the set $\bigcup_{i=1}^n E_i$ is in $\Omega$ (see the proof of fact (a)). Now (d) follows
from (b) applied to the ascending sequence $(\bigcup_{i=1}^n E_i)_{n=1}^\infty$.


Now we use the $\sigma$-finiteness assumption to write $X$ as the disjoint union of subsets
$X_m$ of finite $\mu$-measure, and $Y$ as the disjoint union of subsets $Y_n$ of finite
$\nu$-measure. For a member $E$ of $\mathfrak{M} \otimes \mathfrak{N}$, define $E_{m,n} = E \cap (X_m \times Y_n)$, and let $\Omega_0$ be
the collection of all members $E$ of $\mathfrak{M} \otimes \mathfrak{N}$ such that, for all $m, n \in \mathbb{N}$, $E_{m,n} \in \Omega$.
Facts (b) and (c) imply that $\Omega_0$ is a monotone class, and fact (a) implies that $\Omega_0$
contains all elementary sets. Thus $\Omega_0 = \mathfrak{M} \otimes \mathfrak{N}$ by corollary 8.8.7.

Thus $E_{m,n} \in \Omega$ for every $E \in \mathfrak{M} \otimes \mathfrak{N}$ and for all $m, n \in \mathbb{N}$. Since $E = \bigcup_{m,n} E_{m,n}$
and the sets $E_{m,n}$ are disjoint, fact (d) implies that $E \in \Omega$. ∎


Observe that conclusion (iii) of theorem 8.8.8 can be written as

\[ \int_X \Big\{ \int_Y \chi_E(x, y)\,d\nu(y) \Big\}\,d\mu(x) = \int_Y \Big\{ \int_X \chi_E(x, y)\,d\mu(x) \Big\}\,d\nu(y). \]

Thus the order of integration can be switched in iterated integrals of characteristic
functions of $\mathfrak{M} \otimes \mathfrak{N}$-measurable sets. This is clearly the first step to prove Fubini’s
theorem. First we need to define the product measure of two $\sigma$-finite measure
spaces.
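On finite spaces the two iterated integrals reduce to finite weighted sums, so the exchange of order can be checked exactly. The sketch below assumes two hypothetical weighted finite sets standing in for $(X, \mu)$ and $(Y, \nu)$; the dictionaries `mu` and `nu`, the set `E`, and all function names are illustrative and not from the text.

```python
from fractions import Fraction

# Hypothetical point masses defining finite measures on X = {0, 1, 2} and Y = {"a", "b"}.
mu = {0: Fraction(1, 2), 1: Fraction(2), 2: Fraction(3)}
nu = {"a": Fraction(1), "b": Fraction(1, 3)}


def phi(E, x):
    """phi(x) = nu(E_x): the nu-measure of the x-section of E."""
    return sum(nu[y] for (a, y) in E if a == x)


def psi(E, y):
    """psi(y) = mu(E^y): the mu-measure of the y-section of E."""
    return sum(mu[x] for (x, b) in E if b == y)


def integrate_phi(E):
    """Integral of phi over X with respect to mu."""
    return sum(phi(E, x) * mu[x] for x in mu)


def integrate_psi(E):
    """Integral of psi over Y with respect to nu."""
    return sum(psi(E, y) * nu[y] for y in nu)


# An arbitrary (non-rectangular) subset of X x Y.
E = {(0, "a"), (1, "b"), (2, "a"), (2, "b")}
assert integrate_phi(E) == integrate_psi(E)
```

For a rectangle $A \times B$ both sums collapse to $\mu(A)\nu(B)$, which is the computation in fact (a) of the proof.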


Theorem 8.8.9. Under the assumptions of theorem 8.8.8, the set function defined by

\[ (\mu \otimes \nu)(E) = \int_X \varphi\,d\mu = \int_Y \psi\,d\nu \]

is the unique positive measure on $\mathfrak{M} \otimes \mathfrak{N}$ such that $(\mu \otimes \nu)(A \times B) = \mu(A)\nu(B)$
for all measurable rectangles $A \times B$. Furthermore, $\mu \otimes \nu$ is $\sigma$-finite. The measure
$\mu \otimes \nu$ is the product of the measures $\mu$ and $\nu$.


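Proof. Let $(E_n)$ be a disjoint sequence of $\mathfrak{M} \otimes \mathfrak{N}$-measurable subsets of $X \times Y$, and
let $E = \bigcup_{n=1}^\infty E_n$. Since $E_x = \bigcup_{n=1}^\infty (E_n)_x$ and since the sequence $((E_n)_x)$ is disjoint,
$\nu(E_x) = \sum_{n=1}^\infty \nu((E_n)_x)$. An application of the monotone convergence theorem
yields

\[ (\mu \otimes \nu)(E) = \int_X \sum_{n=1}^\infty \nu((E_n)_x)\,d\mu = \sum_{n=1}^\infty \int_X \nu((E_n)_x)\,d\mu = \sum_{n=1}^\infty (\mu \otimes \nu)(E_n). \]

The $\sigma$-finiteness of $\mu \otimes \nu$ is obvious. We leave the proof of the uniqueness part as
an exercise. ∎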




Remark. Both the existence and uniqueness of the product measure of $\sigma$-finite
spaces can be based on the Hopf extension theorem. For a measurable rectangle
$A \times B$ in $\mathfrak{M} \times \mathfrak{N}$, we define $\rho(A \times B) = \mu(A)\nu(B)$, and, for an elementary set
$C = \bigcup_{i=1}^n A_i \times B_i$, we define $\rho(C) = \sum_{i=1}^n \mu(A_i)\nu(B_i)$. Then one can check that
$\rho$ is countably additive on the algebra $\mathfrak{E}$ of elementary sets (it is not difficult).
Now all the conditions of theorems 8.2.19 and 8.2.20 are met, and the (unique)
Hopf extension of $\rho$ is the product measure $\mu \otimes \nu$. The approach we took to
define the product measure has the slight advantage that it is better motivated
by calculus concepts, as explained in the opening remarks of this section. In
addition, Fubini’s theorem follows without difficulty from the above results.


Fubini’s Theorem 


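Theorem 8.8.10 (Tonelli’s theorem). Suppose $f : X \times Y \to \mathbb{C}$ is an $\mathfrak{M} \otimes \mathfrak{N}$-measurable
function.


(a) If $f$ is positive, let $\varphi(x) = \int_Y f_x\,d\nu$ and $\psi(y) = \int_X f^y\,d\mu$. Then $\varphi$ is
$\mathfrak{M}$-measurable, $\psi$ is $\mathfrak{N}$-measurable, and

\[ \int_X \varphi\,d\mu = \int_{X \times Y} f\,d(\mu \otimes \nu) = \int_Y \psi\,d\nu. \]

(b) In general, let $\varphi^*(x) = \int_Y |f|_x\,d\nu$ and $\psi^*(y) = \int_X |f|^y\,d\mu$. If $\varphi^* \in \mathfrak{L}^1(\mu)$ or if
$\psi^* \in \mathfrak{L}^1(\nu)$, then $f \in \mathfrak{L}^1(\mu \otimes \nu)$.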


Proof. Tonelli’s theorem holds for the characteristic function of an $\mathfrak{M} \otimes \mathfrak{N}$-measurable
set by the previous theorem. By the linearity of the integral, Tonelli’s
theorem holds for any $\mathfrak{M} \otimes \mathfrak{N}$-measurable simple function.

Now let $0 \le s_1 \le s_2 \le \dots$ be a sequence of $\mathfrak{M} \otimes \mathfrak{N}$-simple functions converging
to $f(x, y)$ for every $(x, y) \in X \times Y$, and let $\varphi_n(x) = \int_Y (s_n)_x\,d\nu$. By the above
paragraph, $\int_X \varphi_n\,d\mu = \int_{X \times Y} s_n\,d(\mu \otimes \nu)$. The monotone convergence theorem implies
that $\int_X \varphi\,d\mu = \int_{X \times Y} f\,d(\mu \otimes \nu)$. The proof that $\int_Y \psi\,d\nu = \int_{X \times Y} f\,d(\mu \otimes \nu)$ is
identical to the above.

Part (b) is obtained by applying part (a) to the function $|f|$. ∎


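Theorem 8.8.11 (Fubini’s theorem). If $f \in \mathfrak{L}^1(\mu \otimes \nu)$, then $f_x \in \mathfrak{L}^1(\nu)$ for a.e.
$x \in X$, $f^y \in \mathfrak{L}^1(\mu)$ for a.e. $y \in Y$, the functions $\varphi(x) = \int_Y f_x\,d\nu$ and $\psi(y) = \int_X f^y\,d\mu$
are in $\mathfrak{L}^1(\mu)$ and $\mathfrak{L}^1(\nu)$, respectively, and

\[ \int_X \varphi\,d\mu = \int_{X \times Y} f\,d(\mu \otimes \nu) = \int_Y \psi\,d\nu. \qquad (7) \]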
Proof. It is clearly sufficient to prove the result when $f$ is a real function. Let $f^+$
and $f^-$ be the positive and negative parts of $f$, and write $\varphi_1(x) = \int_Y (f^+)_x\,d\nu$ and
$\varphi_2(x) = \int_Y (f^-)_x\,d\nu$. Since $f^+ \le |f|$, $f^+ \in \mathfrak{L}^1(\mu \otimes \nu)$, theorem 8.8.10 applies, and
$\int_X \varphi_1\,d\mu = \int_{X \times Y} f^+\,d(\mu \otimes \nu) < \infty$. Thus $\varphi_1 \in \mathfrak{L}^1(\mu)$, and example 1 in section 8.3
now implies that $\varphi_1(x)$ is finite for a.e. $x \in X$, that is, $(f^+)_x$ is integrable for a.e.
$x \in X$. Similar results apply to $f^-$; $\varphi_2 \in \mathfrak{L}^1(\mu)$, and $\varphi_2$ is finite for a.e. $x \in X$.
The function $\varphi = \varphi_1 - \varphi_2$ is defined for a.e. $x \in X$, and the identity $\int_X \varphi\,d\mu =
\int_{X \times Y} f\,d(\mu \otimes \nu)$ follows from the fact that $f_x = (f^+)_x - (f^-)_x$ and the linearity of
the integral. The remaining assertion of the theorem and the other identity in (7)
are obtained by replicating the above proof for the function $f^y$. ∎


Products of Lebesgue Measures 


In the discussion below and until the end of the section, $k$ is a positive integer, and
$\lambda_k$ denotes Lebesgue measure on the $\sigma$-algebra $\mathcal{L}^k$ of Lebesgue measurable subsets
of $\mathbb{R}^k$. We also use the notation $\mathcal{B}^k$ to denote the $\sigma$-algebra of Borel subsets of $\mathbb{R}^k$.


In the following, we use the result of problem 10 on section 8.4 without explicit
mention.


Lemma 8.8.12. Let $r$ and $s$ be positive integers, and let $n = r + s$. If $Z$ is a set of
Lebesgue measure 0 in $\mathbb{R}^r$ and $B \in \mathcal{L}^s$, then $Z \times B \in \mathcal{L}^n$, and $\lambda_n(Z \times B) = 0$.


Proof. First assume that $B$ is bounded, and choose an open set $V$ of finite measure
such that $B \subseteq V \subseteq \mathbb{R}^s$. Let $\epsilon > 0$. Choose an open set $U$ such that $Z \subseteq U \subseteq \mathbb{R}^r$
and $\lambda_r(U) < \epsilon$. Since we have not yet established the Lebesgue measurability of
$Z \times B$, we estimate its outer measure: $m_n^*(Z \times B) \le m_n^*(U \times V) = \lambda_n(U \times V) =
\lambda_r(U)\lambda_s(V) < \epsilon\lambda_s(V)$. Since $\epsilon$ is arbitrary, $m_n^*(Z \times B) = 0$; hence $Z \times B$ is
measurable of measure 0.

If $B$ is unbounded, consider the intersection $B_i$ of $B$ with the open ball in $\mathbb{R}^s$
of radius $i$ and centered at the origin. By what we just proved, for each $i \in \mathbb{N}$,
$Z \times B_i \in \mathcal{L}^n$ has measure 0. Since $Z \times B = \bigcup_{i=1}^\infty (Z \times B_i)$, the proof is complete. ∎


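Proposition 8.8.13. Let $r$, $s$, and $n$ be as in lemma 8.8.12. Then


(a) $\mathcal{B}^n \subseteq \mathcal{L}^r \otimes \mathcal{L}^s \subseteq \mathcal{L}^n$.
(b) If $A \in \mathcal{L}^r$ and $B \in \mathcal{L}^s$, then $\lambda_n(A \times B) = \lambda_r(A)\lambda_s(B)$.


Proof. (a) Every open cube in $\mathbb{R}^n$ is the product of two open cubes, one in $\mathbb{R}^r$ and
one in $\mathbb{R}^s$. Thus $\mathcal{L}^r \otimes \mathcal{L}^s$ contains all open cubes in $\mathbb{R}^n$. Since every open subset of
$\mathbb{R}^n$ is a countable union of open cubes, $\mathcal{L}^r \otimes \mathcal{L}^s$ contains all open subsets of $\mathbb{R}^n$;
hence $\mathcal{B}^n \subseteq \mathcal{L}^r \otimes \mathcal{L}^s$.

Suppose $A \in \mathcal{L}^r$ and $B \in \mathcal{L}^s$. By theorem 8.4.10(c), choose $F_\sigma$ sets $F \subseteq A$
and $K \subseteq B$ such that the sets $Z_1 = A - F$ and $Z_2 = B - K$ have measure 0. Now
$A \times B = (F \times K) \cup Z$, where $Z = (F \times Z_2) \cup (Z_1 \times K) \cup (Z_1 \times Z_2)$. By the previous
lemma, $\lambda_n(Z) = 0$. Since the product of $F_\sigma$ sets is an $F_\sigma$ set, $F \times K \in \mathcal{L}^n$. The
$\mathcal{L}^n$-measurability of $A \times B$ is now immediate because it is the union of two measurable
sets. We have shown that $\mathcal{L}^n$ contains all measurable rectangles in $\mathcal{L}^r \otimes \mathcal{L}^s$. Hence
$\mathcal{L}^r \otimes \mathcal{L}^s \subseteq \mathcal{L}^n$.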


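(b) First assume that $A$ and $B$ are bounded $G_\delta$ sets. Thus there exist descending
sequences of bounded open sets $\{U_i\}$ in $\mathbb{R}^r$ and $\{V_j\}$ in $\mathbb{R}^s$ such that $A = \bigcap_{i=1}^\infty U_i$
and $B = \bigcap_{j=1}^\infty V_j$. Now

\[ \lambda_n(A \times B) = \lambda_n\Big(\bigcap_{i=1}^\infty \bigcap_{j=1}^\infty (U_i \times V_j)\Big) = \lim_i \lambda_n\Big(\bigcap_{j=1}^\infty (U_i \times V_j)\Big) \]
\[ = \lim_i \lim_j \lambda_n(U_i \times V_j) = \lim_i \lim_j \lambda_r(U_i)\lambda_s(V_j) = \lambda_r(A)\lambda_s(B). \]

Now, for arbitrary (unbounded) $G_\delta$ sets $A$ and $B$, the result follows from the
$\sigma$-finiteness of Lebesgue measure. We invite the reader to work out the details.


Finally, if $A$ and $B$ are Lebesgue measurable in their respective spaces, then, by
theorem 8.4.10(c), choose $G_\delta$ sets $G$ in $\mathbb{R}^r$ and $H$ in $\mathbb{R}^s$ such that $A = G - Z_1$,
$B = H - Z_2$, where $Z_1$ and $Z_2$ have measure 0. By lemma 8.8.12,

\[ \lambda_n(G \times H) = \lambda_n(A \times B) + \lambda_n(A \times Z_2) + \lambda_n(Z_1 \times B) + \lambda_n(Z_1 \times Z_2) = \lambda_n(A \times B). \]

But $\lambda_n(G \times H) = \lambda_r(G)\lambda_s(H) = \lambda_r(A)\lambda_s(B)$, hence the result. ∎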


Before we proceed to the next theorem, we will show that $\lambda_1 \otimes \lambda_1$ is not a complete
measure. Let $E$ be a subset of $[0, 1]$ that is not Lebesgue measurable. The set $A =
E \times \{0\} \subseteq \mathbb{R}^2$ is contained in $B = [0, 1] \times \{0\}$, which is in $\mathcal{L}^1 \otimes \mathcal{L}^1$. Clearly,
$(\lambda_1 \otimes \lambda_1)(B) = 0$. However, $A$ is not in $\mathcal{L}^1 \otimes \mathcal{L}^1$ by proposition 8.8.2. As a by-product of
this example, it follows that $\mathcal{L}^2$ is strictly larger than $\mathcal{L}^1 \otimes \mathcal{L}^1$.


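Theorem 8.8.14. Let $r$ and $s$ be positive integers, and let $n = r + s$. Then $(\mathbb{R}^n, \mathcal{L}^n, \lambda_n)$
is the completion of $(\mathbb{R}^n, \mathcal{L}^r \otimes \mathcal{L}^s, \lambda_r \otimes \lambda_s)$.


Proof. By the above proposition, if $A \in \mathcal{L}^r$ and $B \in \mathcal{L}^s$, then $\lambda_n(A \times B) =
\lambda_r(A)\lambda_s(B) = (\lambda_r \otimes \lambda_s)(A \times B)$. Thus $\lambda_n$ agrees with $\lambda_r \otimes \lambda_s$ on the set of
measurable rectangles in $\mathcal{L}^r \otimes \mathcal{L}^s$. By the uniqueness of the product measure
(theorem 8.8.9 and problem 5 at the end of this section), $\lambda_n$ extends $\lambda_r \otimes \lambda_s$.
Since $(\mathbb{R}^n, \mathcal{L}^n, \lambda_n)$ is a complete measure space, it contains the completion of
$(\mathbb{R}^n, \mathcal{L}^r \otimes \mathcal{L}^s, \lambda_r \otimes \lambda_s)$.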




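The proof will be complete if we show that for a member $E$ of $\mathcal{L}^n$, there are members
$A$ and $B$ of $\mathcal{L}^r \otimes \mathcal{L}^s$ such that $A \subseteq E \subseteq B$, and $(\lambda_r \otimes \lambda_s)(B - A) = 0$; see problem
4 on section 8.2. By theorem 8.4.10, there exists an $F_\sigma$ set $A \subseteq \mathbb{R}^n$ and a $G_\delta$ set
$B \subseteq \mathbb{R}^n$ such that $A \subseteq E \subseteq B$, and $\lambda_n(B - A) = 0$. Since $A, B \in \mathcal{B}^n \subseteq \mathcal{L}^r \otimes \mathcal{L}^s$, the
above paragraph implies that $(\lambda_r \otimes \lambda_s)(B - A) = \lambda_n(B - A) = 0$, as desired. ∎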


Excursion: The Product of Finitely Many Measures 


It is clear that the above definitions and constructions for the product of two
measurable spaces can be extended to the product of any finite number of
measurable spaces $\{(X_i, \mathfrak{M}_i), 1 \le i \le n\}$. A measurable rectangle is a set of the form
$A_1 \times \dots \times A_n$, $A_i \in \mathfrak{M}_i$, and an elementary set is a disjoint union of a finite number
of measurable rectangles. It is easy to see that the collection, $\mathfrak{E}$, of elementary
sets is an algebra. By definition, $\mathfrak{M}_1 \otimes \dots \otimes \mathfrak{M}_n$ is the $\sigma$-algebra generated by
the collection of measurable rectangles. Obviously, the algebra $\mathfrak{E}$ also generates
$\mathfrak{M}_1 \otimes \dots \otimes \mathfrak{M}_n$.

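We first establish the following technical lemma.

Lemma 8.8.15. Let $(X_i, \mathfrak{M}_i)$, $1 \le i \le 3$, be measurable spaces. Then
$\mathfrak{M}_1 \otimes (\mathfrak{M}_2 \otimes \mathfrak{M}_3) = \mathfrak{M}_1 \otimes \mathfrak{M}_2 \otimes \mathfrak{M}_3$.


Proof. Recall that $\mathfrak{M}_1 \otimes \mathfrak{M}_2 \otimes \mathfrak{M}_3$ is generated by the set of all measurable
rectangles $A_1 \times A_2 \times A_3$, where $A_i \in \mathfrak{M}_i$, while $\mathfrak{M}_1 \otimes (\mathfrak{M}_2 \otimes \mathfrak{M}_3)$ is generated
by sets of the form $A_1 \times P$, where $A_1 \in \mathfrak{M}_1$ and $P \in \mathfrak{M}_2 \otimes \mathfrak{M}_3$. We show
that $\mathfrak{M}_1 \otimes \mathfrak{M}_2 \otimes \mathfrak{M}_3 \subseteq \mathfrak{M}_1 \otimes (\mathfrak{M}_2 \otimes \mathfrak{M}_3)$. Every measurable rectangle $R =
A_1 \times A_2 \times A_3$ in $X_1 \times X_2 \times X_3$ can be written as $R = A_1 \times P$, where $P = A_2 \times A_3$.
Since $P \in \mathfrak{M}_2 \otimes \mathfrak{M}_3$, $\mathfrak{M}_1 \otimes (\mathfrak{M}_2 \otimes \mathfrak{M}_3)$ contains all measurable rectangles;
hence $\mathfrak{M}_1 \otimes (\mathfrak{M}_2 \otimes \mathfrak{M}_3) \supseteq \mathfrak{M}_1 \otimes \mathfrak{M}_2 \otimes \mathfrak{M}_3$.


To prove the reverse containment, it is enough to show that $\mathfrak{M}_1 \otimes \mathfrak{M}_2 \otimes \mathfrak{M}_3$
contains every set of the form $A_1 \times P$, where $A_1 \in \mathfrak{M}_1$ and $P \in \mathfrak{M}_2 \otimes \mathfrak{M}_3$, which
is a generating set for $\mathfrak{M}_1 \otimes (\mathfrak{M}_2 \otimes \mathfrak{M}_3)$.

Define a collection of subsets of $X_2 \times X_3$ as follows:

\[ \Omega = \{P \subseteq X_2 \times X_3 : X_1 \times P \in \mathfrak{M}_1 \otimes \mathfrak{M}_2 \otimes \mathfrak{M}_3\}. \]

It is easy to see that $\Omega$ is a $\sigma$-algebra and that every measurable rectangle $A_2 \times A_3$
is in $\Omega$. Thus $\Omega$ contains the $\sigma$-algebra generated by all measurable rectangles in
$X_2 \times X_3$; hence $\Omega \supseteq \mathfrak{M}_2 \otimes \mathfrak{M}_3$. It follows that, for every $P \in \mathfrak{M}_2 \otimes \mathfrak{M}_3$, $X_1 \times
P \in \mathfrak{M}_1 \otimes \mathfrak{M}_2 \otimes \mathfrak{M}_3$. It is clear that, for every $A_1 \in \mathfrak{M}_1$, $A_1 \times X_2 \times X_3 \in \mathfrak{M}_1 \otimes
\mathfrak{M}_2 \otimes \mathfrak{M}_3$; hence the intersection of $X_1 \times P$ and $A_1 \times X_2 \times X_3$ is in $\mathfrak{M}_1 \otimes \mathfrak{M}_2 \otimes
\mathfrak{M}_3$. But the intersection of the latter two sets is exactly $A_1 \times P$. This concludes the
proof. ∎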


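By an argument almost identical to the above proof, it can be shown that
$(\mathfrak{M}_1 \otimes \mathfrak{M}_2) \otimes \mathfrak{M}_3 = \mathfrak{M}_1 \otimes \mathfrak{M}_2 \otimes \mathfrak{M}_3$. Thus the formation of products of
measurable spaces is associative.


It follows by induction that if $\{(X_i, \mathfrak{M}_i), 1 \le i \le n\}$ is a finite set of measurable
spaces, then

\[ \mathfrak{M}_1 \otimes \dots \otimes \mathfrak{M}_n = \mathfrak{M}_1 \otimes \mathfrak{N}_{n-1}, \quad\text{where } \mathfrak{N}_{n-1} = \mathfrak{M}_2 \otimes \dots \otimes \mathfrak{M}_n. \]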


This immediately suggests an inductive definition of the product of more than two 
measure spaces. 


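Definition. Let $\{(X_i, \mathfrak{M}_i, \mu_i), 1 \le i \le n\}$ be a set of $\sigma$-finite measure spaces. We
define the product measure $\mu_1 \otimes \dots \otimes \mu_n$ on $\mathfrak{M}_1 \otimes \dots \otimes \mathfrak{M}_n = \mathfrak{M}_1 \otimes \mathfrak{N}_{n-1}$ by

\[ \mu_1 \otimes \dots \otimes \mu_n = \mu_1 \otimes \rho_{n-1}, \quad\text{where } \rho_{n-1} = \mu_2 \otimes \dots \otimes \mu_n. \]

Theorem 8.8.9 and the inductive nature of the construction imply that $\mu_1 \otimes \dots \otimes
\mu_n$ is a $\sigma$-finite measure on $\mathfrak{M}_1 \otimes \dots \otimes \mathfrak{M}_n$ and that, for a measurable rectangle
$A_1 \times \dots \times A_n$, we have $(\mu_1 \otimes \dots \otimes \mu_n)(A_1 \times \dots \times A_n) = \prod_{i=1}^n \mu_i(A_i)$.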


Theorem 8.8.16. Suppose that $\{(X_i, \mathfrak{M}_i, \mu_i), 1 \le i \le n\}$ is a set of $\sigma$-finite measure
spaces. Then the product measure $\mu_1 \otimes \dots \otimes \mu_n$ is the unique $\sigma$-finite measure
on $\mathfrak{M}_1 \otimes \dots \otimes \mathfrak{M}_n$ such that, for every measurable rectangle $A_1 \times \dots \times A_n$,

\[ (\mu_1 \otimes \dots \otimes \mu_n)(A_1 \times \dots \times A_n) = \prod_{i=1}^n \mu_i(A_i). \]

The existence of the product measure is by the inductive construction outlined
before the statement of the theorem. The uniqueness of $\mu_1 \otimes \dots \otimes \mu_n$ is by problem
5 at the end of the section.


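Fubini’s theorem (theorem 8.8.11) extends to the product of any finite number of
measures in a straightforward manner. Using the notation we established earlier
in this excursion, if $f \in \mathfrak{L}^1(\mu_1 \otimes \dots \otimes \mu_n)$, then

\[ \int_{X_1 \times \dots \times X_n} f\,d(\mu_1 \otimes \dots \otimes \mu_n) = \int_{X_1} \int_{X_2 \times \dots \times X_n} f\,d\rho_{n-1}\,d\mu_1. \]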
The repeated application of Fubini’s theorem (induction) yields

\[ \int_{X_1 \times \dots \times X_n} f\,d(\mu_1 \otimes \dots \otimes \mu_n) = \int_{X_1} \int_{X_2} \cdots \int_{X_n} f\,d\mu_n \cdots d\mu_2\,d\mu_1. \]


Exercises


1. Let $r$ and $s$ be positive integers, and let $n = r + s$. Prove that $\mathcal{B}^n = \mathcal{B}^r \otimes \mathcal{B}^s$.

2. Let $r$, $s$, and $n$ be as in problem 1. Prove that $\mathcal{L}^r \otimes \mathcal{L}^s$ is strictly contained
in $\mathcal{L}^n$.

3. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be an invertible linear operator. Prove that, for a function
$f$ that is either positive or integrable,

\[ \int_{\mathbb{R}^n} f\,d\lambda = |\det(T)| \int_{\mathbb{R}^n} (f \circ T)\,d\lambda. \qquad (*) \]

Hint: Let $A$ be the matrix of $T$ relative to the standard basis of $\mathbb{R}^n$. By
theorem B.4 in Appendix B, $A$ is the product of elementary matrices. Prove
that (*) holds for linear mappings generated by elementary matrices. You
need Fubini’s theorem and a specialized version of problem 15 on section
8.4. Observe that a useful by-product of this exercise is that if $A$ is an
orthogonal matrix, then, for all $E \in \mathcal{L}^n$, $\lambda(E) = \lambda(AE)$, where $AE = \{Ax :
x \in E\}$. Thus, the Lebesgue measure is rotation invariant.


4. Prove that a proper subspace of $\mathbb{R}^n$ has Lebesgue measure 0. Hint: See
problem 6 on section 8.4.

5. Complete the proof of theorem 8.8.9. Thus prove that if $\rho$ is a measure on
$\mathfrak{M} \otimes \mathfrak{N}$ such that for $A \in \mathfrak{M}$ and $B \in \mathfrak{N}$, $\rho(A \times B) = \mu(A)\nu(B)$, then $\rho =
\mu \otimes \nu$ on $\mathfrak{M} \otimes \mathfrak{N}$. The same result easily extends to the product of any finite
number of measures.


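6. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function

\[ f(x, y) = \begin{cases} 1 & \text{if } x \ge 0,\ x \le y < x + 1, \\ -1 & \text{if } x \ge 0,\ x + 1 \le y < x + 2, \\ 0 & \text{otherwise.} \end{cases} \]

Prove that $\int_{\mathbb{R}} \int_{\mathbb{R}} f(x, y)\,dx\,dy \ne \int_{\mathbb{R}} \int_{\mathbb{R}} f(x, y)\,dy\,dx$. This does not
contradict theorem 8.8.11 because, clearly, $|f|$ is not integrable.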


7. Let $X = [0, 1]$, $\mathfrak{M}$ be $\mathcal{L}^1$ restricted to $[0, 1]$, and $dx$ (or $dy$) denote the
Lebesgue measure on $[0, 1]$. Choose a sequence $\alpha_1 < \alpha_2 < \dots$ in $(0, 1)$ and,
for each $n \ge 1$, let $g_n$ be a continuous positive function that vanishes outside
$[\alpha_n, \alpha_{n+1}]$ such that $\int_0^1 g_n(x)\,dx = 1$. Define a function $f : \mathbb{R}^2 \to \mathbb{R}$ by
$f(x, y) = \sum_{n=1}^\infty g_n(x)[g_n(y) - g_{n+1}(y)]$. Show that $\int_0^1 \int_0^1 f(x, y)\,dx\,dy \ne \int_0^1 \int_0^1
f(x, y)\,dy\,dx$. Also prove directly that $|f|$ is not integrable on the unit square.


8. Show that $\int_0^1 \int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2}\,dy\,dx = -\int_0^1 \int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2}\,dx\,dy = \frac{\pi}{4}$. By integrating
the positive part of $f = \frac{x^2 - y^2}{(x^2 + y^2)^2}$ on the unit square, show directly that $f$ is
not integrable on the unit square.

9. By integrating $e^{-y} \sin 2xy$ on the strip $[0, 1] \times (0, \infty)$, show that
$\int_0^\infty \frac{e^{-y} \sin^2 y}{y}\,dy = \log(5)/4$.

10. Let $\{(X_i, \mathfrak{M}_i), 1 \le i \le n\}$ be a finite set of measurable spaces. Show that the
complement of a measurable rectangle in $X_1 \times \dots \times X_n$ is an elementary set.
This fact is needed in the proof that the collection of elementary sets is an
algebra.
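As a numerical sanity check on the value stated in exercise 9 (not a proof), the two sides can be approximated with a midpoint rule. This sketch assumes the standard inner integrals $\int_0^1 \sin(2xy)\,dx = \sin^2(y)/y$ and $\int_0^\infty e^{-y}\sin(2xy)\,dy = 2x/(1+4x^2)$; the truncation of the $y$-range at 40 and the step counts are arbitrary choices, and the helper names are hypothetical.

```python
import math


def midpoint(g, a, b, steps):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def y_integrand(y):
    """Integrand left after integrating e^{-y} sin(2xy) over x in [0, 1]."""
    return math.exp(-y) * math.sin(y) ** 2 / y


def x_integrand(x):
    """Integrand left after integrating e^{-y} sin(2xy) over y in (0, inf)."""
    return 2 * x / (1 + 4 * x * x)


# The tail beyond y = 40 is smaller than e^{-40} and is ignored here.
y_side = midpoint(y_integrand, 0.0, 40.0, 200_000)
x_side = midpoint(x_integrand, 0.0, 1.0, 20_000)
target = math.log(5) / 4

print(y_side, x_side, target)
```

Both approximations should agree with $\log(5)/4$ to several decimal places; justifying the exchange of the order of integration is the content of the exercise.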


8.9 A Glimpse of Fourier Analysis 


This section has a number of axes. We extend the discussion of Fourier series of
$2\pi$-periodic functions we started in section 4.10. We also study Fourier series of
functions in $\mathfrak{L}^p(-\pi, \pi)$. Then we take a brief tour through the Fourier transform.
Finally we take a last look at the orthogonal polynomials we encountered in
section 4.10.


Fourier Series of $2\pi$-Periodic Functions


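In section 4.10, we looked at the sequence of partial sums $S_n f$ of a $2\pi$-periodic
function $f$. The first tool we develop is an integral formula for $S_n f$. Using the
notation of section 4.10,

\[ S_n f(x) = \sum_{j=-n}^{n} \hat{f}(j) e^{ijx} = \sum_{j=-n}^{n} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-ijt}\,dt\; e^{ijx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \Big( \sum_{j=-n}^{n} e^{ij(x-t)} \Big)\,dt. \]

We define the Dirichlet kernel to be the sequence of functions

\[ D_n(x) = \sum_{j=-n}^{n} e^{ijx}. \]

Then the above calculation yields

\[ S_n f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) D_n(x - t)\,dt = \frac{1}{2\pi} (f * D_n)(x). \]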
Observe that $D_n(x) = 1 + \sum_{j=1}^{n} (e^{ijx} + e^{-ijx}) = 1 + 2\sum_{j=1}^{n} \cos(jx)$. Multiplying the
two sides of the last identity by $\sin(x/2)$, we obtain

\[ \sin\Big(\frac{x}{2}\Big) D_n(x) = \sin\Big(\frac{x}{2}\Big) + \sum_{j=1}^{n} 2\sin\Big(\frac{x}{2}\Big)\cos(jx) \]
\[ = \sin\Big(\frac{x}{2}\Big) + \sum_{j=1}^{n} \Big[ \sin\Big(j + \frac{1}{2}\Big)x - \sin\Big(j - \frac{1}{2}\Big)x \Big] = \sin\Big(n + \frac{1}{2}\Big)x, \]

from which we obtain the formula for $D_n(x)$ in closed form:

\[ D_n(x) = \frac{\sin\big(n + \frac{1}{2}\big)x}{\sin\frac{x}{2}}. \]

The Dirichlet kernel is clearly an even, $2\pi$-periodic function, and $D_n(0) = 2n + 1$.
Since $\sin(x/2) > 0$ on the interval $(0, \pi)$, $D_n(x)$ has its roots in $(0, \pi)$ at the roots of the
function $\sin(n + 1/2)x$, namely, $x_j = \frac{2j\pi}{2n + 1}$, $1 \le j \le n$.

The graph of $D_9$ appears in figure 8.7.
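The closed form can be compared numerically with the defining sum. The following sketch assumes nothing beyond the two formulas above; the function names `dirichlet_sum` and `dirichlet_closed` are illustrative.

```python
import math


def dirichlet_sum(n, x):
    """D_n(x) computed from the cosine form 1 + 2 * sum_{j=1}^n cos(jx)."""
    return 1 + 2 * sum(math.cos(j * x) for j in range(1, n + 1))


def dirichlet_closed(n, x):
    """D_n(x) computed from sin((n + 1/2)x) / sin(x/2), valid when sin(x/2) != 0."""
    return math.sin((n + 0.5) * x) / math.sin(x / 2)


n = 5
for x in (0.3, 1.0, 2.0, 3.0):
    assert abs(dirichlet_sum(n, x) - dirichlet_closed(n, x)) < 1e-9

# The roots in (0, pi) sit at x_j = 2*j*pi / (2n + 1) for j = 1, ..., n.
for j in range(1, n + 1):
    assert abs(dirichlet_sum(n, 2 * j * math.pi / (2 * n + 1))) < 1e-9
```

At $x = 0$ the closed form is undefined, while the cosine form gives the value $2n + 1$ directly.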


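Example 1. We derive the following estimate of $\|D_n\|_1$:

\[ \|D_n\|_1 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(x)|\,dx > \frac{4}{\pi^2} \sum_{k=1}^{n} \frac{1}{k}. \]

Figure 8.7 $D_9$

Since $0 < \sin(x/2) < x/2$ for $x \in (0, \pi)$,

\[ \|D_n\|_1 = \frac{1}{\pi} \int_0^{\pi} \frac{|\sin(n + 1/2)x|}{\sin\frac{x}{2}}\,dx > \frac{2}{\pi} \int_0^{\pi} \frac{|\sin(n + 1/2)x|}{x}\,dx \]
\[ = \frac{2}{\pi} \int_0^{(n + 1/2)\pi} \frac{|\sin x|}{x}\,dx > \frac{2}{\pi} \int_0^{n\pi} \frac{|\sin x|}{x}\,dx = \frac{2}{\pi} \sum_{k=1}^{n} \int_{(k-1)\pi}^{k\pi} \frac{|\sin x|}{x}\,dx \]
\[ > \frac{2}{\pi} \sum_{k=1}^{n} \frac{1}{k\pi} \int_{(k-1)\pi}^{k\pi} |\sin x|\,dx = \frac{4}{\pi^2} \sum_{k=1}^{n} \frac{1}{k}. \]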
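The estimate of example 1 can be observed numerically. This is a rough sketch that approximates $\|D_n\|_1 = \frac{1}{\pi}\int_0^\pi |D_n(x)|\,dx$ by a midpoint rule (the step count is an arbitrary choice) and compares it with the harmonic lower bound; the function names are hypothetical.

```python
import math


def dirichlet_l1_norm(n, steps=20_000):
    """Midpoint approximation of (1/pi) * integral over (0, pi) of |D_n(x)| dx."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += abs(math.sin((n + 0.5) * x) / math.sin(x / 2))
    return total * h / math.pi


def harmonic_lower_bound(n):
    """The lower bound (4 / pi^2) * sum_{k=1}^n 1/k from example 1."""
    return 4 / math.pi ** 2 * sum(1 / k for k in range(1, n + 1))


for n in (1, 5, 20):
    print(n, dirichlet_l1_norm(n), harmonic_lower_bound(n))
```

Both columns grow without bound, at the logarithmic rate of the harmonic sum, which is the fact used in the proof of theorem 8.9.1.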


We are now ready to prove that the Fourier series of a continuous, $2\pi$-periodic
function $f$ need not converge pointwise to $f$.


Theorem 8.9.1. There exists a function $f \in C(S^1)$ such that $S_n f(0)$ does not converge
to $f(0)$.


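Proof. We prove that, for some continuous, $2\pi$-periodic function $f$, the sequence
$S_n f(0)$ is unbounded. For each $n \in \mathbb{N}$, define a functional $\Lambda_n$ on $C(S^1)$ as follows:
$\Lambda_n(f) = S_n f(0)$. Then

\[ |\Lambda_n(f)| = \frac{1}{2\pi} \Big| \int_{-\pi}^{\pi} f(t) D_n(t)\,dt \Big| \le \|f\|_\infty \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(t)|\,dt = \|D_n\|_1 \|f\|_\infty. \]

It follows that $\|\Lambda_n\| \le \|D_n\|_1$. We show that $\|\Lambda_n\| = \|D_n\|_1$. Let $\epsilon > 0$. Consider
the function

\[ f(x) = \begin{cases} 1 & \text{if } D_n(x) \ge 0, \\ -1 & \text{if } D_n(x) < 0. \end{cases} \]

Observe that $f(x) D_n(x) = |D_n(x)|$ for all $x \in [-\pi, \pi]$. By example 3 in section 8.7,
there exists a function $g \in C(S^1)$ such that $\|f - g\|_1 < \epsilon$; truncating $g$ if necessary, we
may assume that $\|g\|_\infty \le 1$. Now

\[ \Big| \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(x) g(x)\,dx - \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(x)|\,dx \Big| = \Big| \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(x) \big( g(x) - f(x) \big)\,dx \Big| \]
\[ \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(x)|\,|f(x) - g(x)|\,dx \le \|D_n\|_\infty \|f - g\|_1 < \epsilon \|D_n\|_\infty. \]

It follows that $|\Lambda_n(g)| = \big| \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(x) g(x)\,dx \big| \ge \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_n(x)|\,dx - \epsilon \|D_n\|_\infty$.
Since $\epsilon$ is arbitrary, $\|\Lambda_n\| = \|D_n\|_1$.

In particular, the sequence of functionals $(\Lambda_n)$ is not uniformly bounded, since
$\|\Lambda_n\| = \|D_n\|_1 \to \infty$ as $n \to \infty$. By the Banach-Steinhaus theorem, $(\Lambda_n)$ cannot be
pointwise bounded. Thus there exists a function $f \in C(S^1)$ such that $\sup\{|S_n f(0)| :
n \in \mathbb{N}\} = \sup\{|\Lambda_n(f)| : n \in \mathbb{N}\} = \infty$. ∎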


We now prove another classical theorem about the convergence of the means
of the sequence of partial sums of the Fourier series of continuous, $2\pi$-periodic
functions.

For a function $f \in C(S^1)$, consider the trigonometric polynomial

\[ (\sigma_n f)(x) = \frac{S_0 f(x) + \dots + S_n f(x)}{n + 1}. \]

Since $S_j f(x) = \frac{1}{2\pi} (f * D_j)(x)$, $\sigma_n f(x) = \frac{1}{2\pi} (f * K_n)(x)$, where

\[ K_n(x) = \frac{1}{n + 1} (D_0 + \dots + D_n). \]


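The function $K_n$ is known as the Fejér kernel. We derive a formula for $K_n$ below.
From the formula for the Dirichlet kernel, we have

\[ (n + 1) \sin\Big(\frac{x}{2}\Big) K_n(x) = \sum_{j=0}^{n} \sin\Big(j + \frac{1}{2}\Big)x. \]

Thus

\[ (n + 1) \sin^2\Big(\frac{x}{2}\Big) K_n(x) = \sum_{j=0}^{n} \sin\Big(\frac{x}{2}\Big) \sin\Big(j + \frac{1}{2}\Big)x \]
\[ = \frac{1}{2} \sum_{j=0}^{n} \big( \cos jx - \cos(j + 1)x \big) = \frac{1}{2} \big( 1 - \cos(n + 1)x \big) = \sin^2\Big(\frac{(n + 1)x}{2}\Big). \]

Hence

\[ K_n(x) = \frac{\sin^2\big(\frac{(n + 1)x}{2}\big)}{(n + 1)\sin^2\big(\frac{x}{2}\big)}. \]

Clearly, $K_n$ is an even, positive, $2\pi$-periodic function, and since $\int_{-\pi}^{\pi} D_j(x)\,dx = 2\pi$
for all $j \in \mathbb{N}$, $\int_{-\pi}^{\pi} K_n(x)\,dx = 2\pi$ for all $n \in \mathbb{N}$.


The following property of $K_n$ is crucial for the next theorem: For $\delta \in
(0, \pi)$, $\lim_{n \to \infty} \max\{K_n(x) : \delta \le |x| \le \pi\} = 0$. This is because if $\delta \le |x| \le \pi$, then
$\sin^2(x/2) \ge \sin^2(\delta/2)$, and hence $0 \le K_n(x) \le \frac{1}{(n + 1)\sin^2(\delta/2)} \to 0$ as $n \to \infty$.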
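Both the closed form of the Fejér kernel and its decay away from the origin can be checked numerically. The sketch below assumes the cosine form of $D_j$ and samples $[\delta, \pi]$ on a uniform grid; the names `fejer_average`, `fejer_closed`, and `max_on_tail` and the grid size are illustrative.

```python
import math


def dirichlet(j, x):
    """D_j(x) = 1 + 2 * sum_{k=1}^j cos(kx)."""
    return 1 + 2 * sum(math.cos(k * x) for k in range(1, j + 1))


def fejer_average(n, x):
    """K_n(x) as the average (D_0 + ... + D_n) / (n + 1)."""
    return sum(dirichlet(j, x) for j in range(n + 1)) / (n + 1)


def fejer_closed(n, x):
    """K_n(x) from the closed form, valid when sin(x/2) != 0."""
    return math.sin((n + 1) * x / 2) ** 2 / ((n + 1) * math.sin(x / 2) ** 2)


def max_on_tail(n, delta, steps=2_000):
    """Largest sampled value of K_n on a uniform grid of [delta, pi]."""
    h = (math.pi - delta) / steps
    return max(fejer_closed(n, delta + i * h) for i in range(steps + 1))


for x in (0.3, 1.0, 2.5):
    assert abs(fejer_average(6, x) - fejer_closed(6, x)) < 1e-9

for n in (20, 200):
    print(n, max_on_tail(n, 0.5), 1 / ((n + 1) * math.sin(0.25) ** 2))
```

The printed maxima stay below the bound $1/((n+1)\sin^2(\delta/2))$ and shrink as $n$ grows, in line with the property stated above.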


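Theorem 8.9.2 (Fejér’s theorem). For $f \in C(S^1)$, $\lim_n \|\sigma_n f - f\|_\infty = 0$.

Proof. Since $f * K_n = K_n * f$, it is more convenient here to write $\sigma_n f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x -
t) K_n(t)\,dt$. Let $\epsilon > 0$. By the uniform continuity of $f$, there exists a number
$\delta > 0$ such that if $|t| < \delta$, then $|f(x - t) - f(x)| < \epsilon$ for all $x \in [-\pi, \pi]$. Choose a
natural number $N$ such that, for $n > N$, $\max\{K_n(t) : \delta \le |t| \le \pi\} < \epsilon$. Recall that
$\int_{-\pi}^{\pi} K_n(t)\,dt = 2\pi$. Now, for $n > N$,

\[ |\sigma_n f(x) - f(x)| = \frac{1}{2\pi} \Big| \int_{-\pi}^{\pi} \big( f(x - t) - f(x) \big) K_n(t)\,dt \Big| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x - t) - f(x)|\,K_n(t)\,dt \]
\[ = \frac{1}{2\pi} \int_{|t| < \delta} |f(x - t) - f(x)|\,K_n(t)\,dt + \frac{1}{2\pi} \int_{\delta \le |t| \le \pi} |f(x - t) - f(x)|\,K_n(t)\,dt \]
\[ \le \frac{\epsilon}{2\pi} \int_{|t| < \delta} K_n(t)\,dt + \frac{2\|f\|_\infty}{2\pi} \int_{\delta \le |t| \le \pi} \epsilon\,dt < \epsilon + 2\epsilon \|f\|_\infty. \]

Since $\epsilon$ is arbitrary, the proof is complete. ∎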


Observe that Fejér’s theorem furnishes another proof that trigonometric polynomials
are uniformly dense in $C(S^1)$.
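Uniform convergence of the means can be watched on a concrete function. The sketch below uses $f(x) = |x|$ on $[-\pi, \pi]$ as an illustrative choice (it is not an example from the text), together with two standard facts assumed here: its Fourier coefficients are $\hat{f}(0) = \pi/2$ and $\hat{f}(j) = ((-1)^j - 1)/(\pi j^2)$ for $j \ne 0$, and $\sigma_n f = \sum_{|j| \le n} \big(1 - \frac{|j|}{n+1}\big) \hat{f}(j) e^{ijx}$. The function names are hypothetical.

```python
import math


def coeff(j):
    """Fourier coefficient of f(x) = |x| on [-pi, pi] (assumed closed form)."""
    if j == 0:
        return math.pi / 2
    return ((-1) ** j - 1) / (math.pi * j * j)


def fejer_mean(n, x):
    """sigma_n f(x): Fourier series of |x| with the weights 1 - j/(n + 1)."""
    return coeff(0) + sum(
        (1 - j / (n + 1)) * 2 * coeff(j) * math.cos(j * x) for j in range(1, n + 1)
    )


def sup_error(n, steps=400):
    """Largest sampled value of |sigma_n f(x) - |x|| on a grid of [-pi, pi]."""
    grid = (-math.pi + 2 * math.pi * i / steps for i in range(steps + 1))
    return max(abs(fejer_mean(n, x) - abs(x)) for x in grid)


for n in (2, 20, 200):
    print(n, sup_error(n))
```

The sampled sup-norm error decreases as $n$ grows, with the largest errors at the corners $x = 0$ and $x = \pm\pi$ of the periodic extension.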


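Fourier Series of $\mathfrak{L}^p$-Functions


Consider the Banach space $\mathfrak{L}^p(-\pi, \pi)$ with the norm

\[ \|f\|_p = \Big( \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^p\,dx \Big)^{1/p}, \quad 1 \le p < \infty. \]

We are primarily interested in the cases $p = 1$ and $p = 2$, but a good number of
the results in this subsection are valid for any $p \in [1, \infty)$. One can see directly that
the Fourier coefficients $\hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-int}\,dt$ of a function $f$ in $\mathfrak{L}^p(-\pi, \pi)$ are
defined. This is because, for $p \ge 1$, $\mathfrak{L}^p(-\pi, \pi) \subseteq \mathfrak{L}^1(-\pi, \pi)$; hence

\[ |\hat{f}(n)| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)|\,dt = \|f\|_1 < \infty. \]

It is convenient to refer to the set of Fourier coefficients $(\hat{f}(n))_{n \in \mathbb{Z}}$ of a function
$f \in \mathfrak{L}^p(-\pi, \pi)$ by the notation $\mathfrak{F}(f)$. We think of $\mathfrak{F}$ as a linear transformation
from $\mathfrak{L}^p(-\pi, \pi)$ to some suitable range space. For example, when $p = 2$, the
range space of $\mathfrak{F}$ is $l^2(\mathbb{Z})$. We will show in example 2 below that, for all $p \ge 1$,
and all $f \in \mathfrak{L}^p(-\pi, \pi)$, $\mathfrak{F}(f) \in c_0(\mathbb{Z})$. The norm on $c_0(\mathbb{Z})$ is the $\infty$-norm. Thus
$\|\mathfrak{F}(f)\|_\infty = \sup\{|\hat{f}(n)| : n \in \mathbb{Z}\}$.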


The following theorem is a special case of example 3 in section 8.7.

Theorem 8.9.3. Trigonometric polynomials are dense in $\mathfrak{L}^p(-\pi, \pi)$.


We are now ready to extend the discussion of Fourier series we started in section
4.10 from $C(S^1)$ to $\mathfrak{L}^2(-\pi, \pi)$.


Theorem 8.9.4. The set $\{u_n(t) = e^{int} : n \in \mathbb{Z}\}$ is an orthonormal basis for
$\mathfrak{L}^2(-\pi, \pi)$.


Proof. By theorem 8.9.3, $\mathrm{Span}(\{u_n : n \in \mathbb{Z}\})$ is dense in $\mathfrak{L}^2(-\pi, \pi)$. The assertion
of this theorem follows directly from theorem 7.2.7. ∎


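All the results we obtained in section 4.10 for Fourier series of continuous
functions extend to $\mathfrak{L}^2(-\pi, \pi)$. The following theorem lists some of the properties.
They follow directly from general Hilbert space theory.


Theorem 8.9.5. The following are true for a function $f \in \mathfrak{L}^2(-\pi, \pi)$:


(a) The sequence $(\hat{f}(n))_{n \in \mathbb{Z}}$ belongs to $l^2(\mathbb{Z})$.

(b) If $a = (a_n) \in l^2(\mathbb{Z})$, then the series $\sum_{n \in \mathbb{Z}} a_n u_n$ converges in $\mathfrak{L}^2(-\pi, \pi)$.

(c) $f(x) = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{inx}$, where convergence takes place in $\mathfrak{L}^2(-\pi, \pi)$.

(d) $\|f\|_2^2 = \sum_{n=-\infty}^{\infty} |\hat{f}(n)|^2$.

(e) If $\hat{f}(n) = 0$ for all $n \in \mathbb{Z}$, then $f = 0$ a.e.

(f) The mapping $f \mapsto \mathfrak{F}(f)$ is a linear isometry from $\mathfrak{L}^2(-\pi, \pi)$ onto $l^2(\mathbb{Z})$.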
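Part (d) can be illustrated with $f(x) = x$ on $(-\pi, \pi)$, an example chosen here for convenience rather than taken from the text. The sketch assumes the standard computation $\hat{f}(0) = 0$ and $|\hat{f}(n)|^2 = 1/n^2$ for $n \ne 0$, and uses the normalized norm $\|f\|_2^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f|^2$; the function names are hypothetical.

```python
import math


def norm_squared(steps=100_000):
    """Midpoint approximation of (1 / 2pi) * integral of x^2 over (-pi, pi)."""
    h = 2 * math.pi / steps
    total = sum((-math.pi + (i + 0.5) * h) ** 2 for i in range(steps))
    return total * h / (2 * math.pi)


def coefficient_energy(N):
    """Partial sum of |f^(n)|^2 over 0 < |n| <= N, using |f^(n)|^2 = 1 / n^2."""
    return 2 * sum(1 / (n * n) for n in range(1, N + 1))


print(norm_squared(), coefficient_energy(10_000), math.pi ** 2 / 3)
```

The partial sums increase to $\pi^2/3$, which is also the value of $\|f\|_2^2$; the gap after $N$ terms is roughly $2/N$.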


The simplicity, elegance, and completeness of theorem 8.9.5 do not extend
to functions in $\mathfrak{L}^1(-\pi, \pi)$. For example, the sequence of partial sums $S_n f(x) =
\sum_{j=-n}^{n} \hat{f}(j) u_j(x)$ need not converge to $f$ in the 1-norm (see the section exercises), and
$\mathfrak{F}$ does not map $\mathfrak{L}^1(-\pi, \pi)$ onto its range space, which we now describe.


Example 2 (the Riemann-Lebesgue lemma). For $f \in \mathfrak{L}^1(-\pi, \pi)$, $\mathfrak{F}(f)$ is in $c_0(\mathbb{Z})$.


Observe that the assertion holds for trigonometric polynomials. Indeed, if
$p(t) = \sum_{j=-N}^{N} c_j e^{ijt}$, then $\hat{p}(n) = 0$ whenever $|n| > N$.

To prove the general case, let $f \in \mathfrak{L}^1(-\pi, \pi)$, and let $\epsilon > 0$. Choose a trigonometric
polynomial $p$ such that $\|f - p\|_1 < \epsilon$, and an integer $N$ such that $\hat{p}(n) = 0$
for $|n| > N$. Now if $|n| > N$,

\[ |\hat{f}(n)| = |\hat{f}(n) - \hat{p}(n)| = \frac{1}{2\pi} \Big| \int_{-\pi}^{\pi} \big( f(t) - p(t) \big) e^{-int}\,dt \Big| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t) - p(t)|\,dt = \|f - p\|_1 < \epsilon. \ ∎ \]

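Example 3. If we view $\sigma_n$ as a linear operator on $\mathfrak{L}^1(-\pi, \pi)$, $\sigma_n$ is bounded, and
$\|\sigma_n\| \le 1$. For $f \in \mathfrak{L}^1(-\pi, \pi)$,

\[ \|\sigma_n f\|_1 = \frac{1}{2\pi} \int_{-\pi}^{\pi} \Big| \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) K_n(x - t)\,dt \Big|\,dx \le \frac{1}{4\pi^2} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} |f(t)|\,K_n(x - t)\,dt\,dx \]
\[ = \frac{1}{4\pi^2} \int_{-\pi}^{\pi} |f(t)| \int_{-\pi}^{\pi} K_n(x - t)\,dx\,dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)|\,dt = \|f\|_1. \]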


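Example 4. For $f \in \mathfrak{L}^1(-\pi, \pi)$, $\sigma_n f$ converges to $f$ in $\mathfrak{L}^1(-\pi, \pi)$.
Let $\epsilon > 0$, and choose $g \in C(S^1)$ such that $\|f - g\|_1 < \epsilon$. By Fejér’s theorem,
$\sigma_n g$ converges to $g$ uniformly; hence $\sigma_n g$ converges to $g$ in $\mathfrak{L}^1(-\pi, \pi)$. Choose
an integer $N$ such that, for $n > N$, $\|\sigma_n g - g\|_1 < \epsilon$. Using example 3, if $n >
N$, then $\|\sigma_n f - f\|_1 \le \|\sigma_n f - \sigma_n g\|_1 + \|\sigma_n g - g\|_1 + \|g - f\|_1 \le 2\|f - g\|_1 + \|\sigma_n g -
g\|_1 < 3\epsilon$.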


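Theorem 8.9.6 (the uniqueness theorem). If $f \in \mathfrak{L}^1(-\pi, \pi)$ and $\hat{f}(n) = 0$ for all
$n \in \mathbb{Z}$, then $f = 0$ a.e. Consequently, the mapping $\mathfrak{F} : \mathfrak{L}^1(-\pi, \pi) \to c_0(\mathbb{Z})$ is
injective.


Proof. By assumption, $\sigma_n f = 0$ for all $n \in \mathbb{N}$. Since $\sigma_n f$ converges to $f$ in $\mathfrak{L}^1(-\pi, \pi)$ by the
previous example, $\|f\|_1 = 0$, and $f = 0$ a.e. ∎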


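Theorem 8.9.7. The linear mapping $\mathfrak{F}:\mathfrak{L}^1(-\pi,\pi)\to c_0(\mathbb{Z})$ is not onto.

Proof. First observe that $\mathfrak{F}$ is bounded by virtue of the inequality $|\hat f(n)|\le\|f\|_1$. If $\mathfrak{F}$ were surjective, then, since $\mathfrak{F}$ is injective by theorem 8.9.6, the open mapping theorem would make $\mathfrak{F}$ invertible with a bounded inverse; hence, for every $f\in\mathfrak{L}^1(-\pi,\pi)$, $\|f\|_1\le M\|\mathfrak{F}(f)\|_\infty$, where $M=\|\mathfrak{F}^{-1}\|$. Now, for the sequence $(D_n)$ of Dirichlet kernels, $\|\mathfrak{F}(D_n)\|_\infty=1$, while $\|D_n\|_1\to\infty$ as $n\to\infty$. This contradiction delivers the result. ∎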
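The growth of $\|D_n\|_1$ used in the proof can be observed numerically. The sketch below assumes the unnormalized Dirichlet kernel $D_n(t)=\sum_{j=-n}^{n}e^{ijt}$ and the normalized norm $\|f\|_1=\frac{1}{2\pi}\int_{-\pi}^{\pi}|f|$; the function names and the midpoint-rule quadrature are illustrative choices, not part of the text.

```python
import math

def dirichlet(n, t):
    """D_n(t) = sum_{j=-n}^{n} e^{ijt}, written in closed form (assumed normalization)."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:              # removable singularity at t = 0
        return 2 * n + 1
    return math.sin((n + 0.5) * t) / s

def normalized_l1_norm(n, steps=100_000):
    """Midpoint-rule estimate of (1/(2*pi)) * integral of |D_n| over (-pi, pi)."""
    h = 2 * math.pi / steps
    total = sum(abs(dirichlet(n, -math.pi + (k + 0.5) * h)) for k in range(steps))
    return total * h / (2 * math.pi)

# The Fourier coefficients of D_n are 0 or 1, yet the 1-norms keep growing with n.
norms = {n: normalized_l1_norm(n) for n in (1, 10, 100)}
for n, value in norms.items():
    print(n, round(value, 4))
```

The computed norms increase slowly (roughly like $\log n$), which is all the proof needs: no constant $M$ can bound $\|D_n\|_1$ by $M\|\mathfrak{F}(D_n)\|_\infty=M$.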


The Fourier Transform 


The Fourier transform of a function $f\in\mathfrak{L}^1(\mathbb{R})$ is, by definition, the function

$$\hat f(x)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}f(t)e^{-ixt}\,dt.$$

The normalization constant $\frac{1}{\sqrt{2\pi}}$ is included only for the symmetry of the formulas and is not essential.

One can think of the Fourier transform as the continuous equivalent of Fourier series. Instead of using the discrete set of frequencies $\{e^{int}\}_{n\in\mathbb{Z}}$, the Fourier transform uses a continuum of frequencies $\{e^{ixt}\}_{x\in\mathbb{R}}$.


It is clear that $\hat f\in\mathfrak{L}^\infty(\mathbb{R})$ and that $\|\hat f\|_\infty\le\|f\|_1$. The following theorem narrows down the range space of the Fourier transform.


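Theorem 8.9.8. If $f\in\mathfrak{L}^1(\mathbb{R})$, then $\hat f\in C_0(\mathbb{R})$.

Proof. First we prove that $\hat f$ is continuous. Suppose $(x_n)$ is a convergent sequence and that $\lim_nx_n=x_0$. Since $|f(t)e^{-ix_nt}|=|f(t)|$ and $f\in\mathfrak{L}^1(\mathbb{R})$, the dominated convergence theorem implies that

$$\lim_n\hat f(x_n)=\lim_n\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}f(t)e^{-ix_nt}\,dt=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}f(t)e^{-ix_0t}\,dt=\hat f(x_0).$$

To prove that $\hat f$ vanishes at $\infty$, write, for $x\ne0$,

$$\hat f(x)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}f(t)e^{-ixt}\,dt=-\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}f(t)e^{-ix(t+\alpha)}\,dt=-\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}f(s-\alpha)e^{-ixs}\,ds\quad(\alpha=\pi/x).$$

Thus

$$2\hat f(x)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}f(s)e^{-ixs}\,ds-\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}f(s-\alpha)e^{-ixs}\,ds=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}(f-\tau_\alpha f)(s)e^{-ixs}\,ds.$$

It follows that

$$2|\hat f(x)|\le\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}|(f-\tau_\alpha f)(s)|\,ds=\frac{1}{\sqrt{2\pi}}\|f-\tau_\alpha f\|_1.$$

As $|x|\to\infty$, $\alpha\to0$, and $\lim_{\alpha\to0}\|f-\tau_\alpha f\|_1=0$, by theorem 8.7.9. ∎

In this theorem, the fact that $\lim_{|x|\to\infty}\hat f(x)=0$ is known as the Riemann-Lebesgue lemma.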
The next goal is to prove the inversion formula. Guided by the inversion formula for a function $f\in\mathfrak{L}^1(-\pi,\pi)$ when $\mathfrak{F}(f)\in\ell^1(\mathbb{Z})$ (see problem 1 at the end of this section), one can reasonably conjecture that if $f$ and $\hat f$ are both in $\mathfrak{L}^1(\mathbb{R})$, then

$$f(x)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\hat f(t)e^{ixt}\,dt\quad\text{for almost every }x\in\mathbb{R}.$$

The proof bears some resemblance to that of Fejér's theorem in that we will find a family of functions $\{G_\sigma\}$ such that $f*G_\sigma$ converges to $f$ in $\mathfrak{L}^1(\mathbb{R})$ as $\sigma\downarrow0$. Before we construct the family $G_\sigma$, it may be useful to find an even function that is equal to its own Fourier transform. One such function exists. The proof of the following proposition is left as an exercise.


Proposition 8.9.9. For the function $G_1(x)=\frac{1}{\sqrt{2\pi}}e^{-x^2/2}$, $\hat G_1=G_1$. ∎
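A quick numerical sanity check of the proposition is sketched below. It approximates the Fourier transform (with the $\frac{1}{\sqrt{2\pi}}$ normalization used in this section) by a trapezoid rule on a truncated interval; the helper name `fourier_transform`, the truncation width, and the step count are illustrative assumptions.

```python
import cmath
import math

def fourier_transform(f, x, half_width=12.0, steps=2400):
    """Trapezoid estimate of (1/sqrt(2*pi)) * integral of f(t) e^{-ixt} dt
    over the truncated interval [-half_width, half_width]."""
    h = 2 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        t = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(t) * cmath.exp(-1j * x * t)
    return total * h / math.sqrt(2 * math.pi)

def G1(x):
    """The Gaussian G_1(x) = exp(-x^2/2) / sqrt(2*pi)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Largest deviation between the numerical transform of G_1 and G_1 itself.
max_error = max(abs(fourier_transform(G1, x) - G1(x)) for x in (0.0, 0.5, 1.0, 2.0, 3.0))
print(max_error)
```

The deviation is at the level of rounding error, which is consistent with $\hat G_1=G_1$ but is of course not a proof.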


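Example 5. The inversion formula holds for the function $G_1$.
Because $G_1$ is even,

$$G_1(x)=G_1(-x)=\hat G_1(-x)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}G_1(t)e^{ixt}\,dt=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\hat G_1(t)e^{ixt}\,dt.$$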


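Definition. The Gauss kernel. For $\sigma>0$, define

$$G_\sigma(x)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left(\frac{-x^2}{2\sigma^2}\right).$$

Observe that $G_\sigma(x)=\frac{1}{\sigma}G_1(x/\sigma)$.
The family $\{G_\sigma:\sigma>0\}$ is an approximate identity in the sense that
(a) $G_\sigma(x)>0$ for all $x\in\mathbb{R}$ and all $\sigma>0$,
(b) $\int_{\mathbb{R}}G_\sigma(x)\,dx=1$ for all $\sigma>0$, and
(c) for every $\delta>0$, $\lim_{\sigma\downarrow0}\int_{|x|\ge\delta}G_\sigma(x)\,dx=0$.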


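Example 6. We prove property (c). For $|x|\ge\delta$, we have $1\le|x|/\delta$; hence

$$\int_{|x|\ge\delta}G_\sigma(x)\,dx=\frac{2}{\sigma\sqrt{2\pi}}\int_\delta^\infty\exp\left(\frac{-x^2}{2\sigma^2}\right)dx\le\frac{2}{\sigma\sqrt{2\pi}}\int_\delta^\infty\frac{x}{\delta}\exp\left(\frac{-x^2}{2\sigma^2}\right)dx=\frac{2\sigma}{\delta\sqrt{2\pi}}\exp\left(\frac{-\delta^2}{2\sigma^2}\right)\to0\quad\text{as }\sigma\downarrow0.\ ∎$$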
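Property (c) can also be checked numerically. For the Gauss kernel the tail mass has the closed form $\int_{|x|\ge\delta}G_\sigma=\operatorname{erfc}\big(\delta/(\sigma\sqrt2)\big)$, and the standard Gaussian tail estimate gives the upper bound $\sqrt{2/\pi}\,(\sigma/\delta)\exp(-\delta^2/(2\sigma^2))$. The sketch below compares the two; the function names and the sample values of $\sigma$ and $\delta$ are illustrative.

```python
import math

def tail_mass(sigma, delta):
    """Integral of G_sigma over |x| >= delta, via the complementary error function."""
    return math.erfc(delta / (sigma * math.sqrt(2)))

def tail_bound(sigma, delta):
    """Upper bound sqrt(2/pi) * (sigma/delta) * exp(-delta^2 / (2 sigma^2))."""
    return math.sqrt(2 / math.pi) * (sigma / delta) * math.exp(-delta**2 / (2 * sigma**2))

delta = 0.5
rows = [(s, tail_mass(s, delta), tail_bound(s, delta)) for s in (1.0, 0.5, 0.25, 0.1)]
for sigma, mass, bound in rows:
    print(f"sigma={sigma:<5} tail mass={mass:.3e}  bound={bound:.3e}")
```

For fixed $\delta$ both columns shrink rapidly as $\sigma$ decreases, and the tail mass never exceeds the bound.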
Other examples of approximate identities include the bump kernel we studied in section 8.7. Indeed, for a fixed $\eta>0$, $\lim_{h\downarrow0}\int_{|x|\ge\eta}\varphi_h(x)\,dx=0$, because, for every $h<\eta$, $\int_{|x|\ge\eta}\varphi_h(x)\,dx=0$.


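Example 7. Let $g:\mathbb{R}\to\mathbb{R}$ be a bounded continuous function. Then

$$\lim_{\sigma\downarrow0}\int_{\mathbb{R}}g(y)G_\sigma(y)\,dy=g(0).$$

Let $\epsilon>0$, and choose $\delta>0$ such that $|g(y)-g(0)|<\epsilon$ for all $y$ such that $|y|<\delta$. Also choose $\sigma_0>0$ such that $\int_{|y|\ge\delta}G_\sigma(y)\,dy<\epsilon$ whenever $0<\sigma<\sigma_0$. For such $\sigma$,

$$\left|\int_{\mathbb{R}}g(y)G_\sigma(y)\,dy-g(0)\right|=\left|\int_{\mathbb{R}}g(y)G_\sigma(y)\,dy-\int_{\mathbb{R}}g(0)G_\sigma(y)\,dy\right|\le\int_{\mathbb{R}}|g(y)-g(0)|G_\sigma(y)\,dy$$

$$=\int_{|y|<\delta}|g(y)-g(0)|G_\sigma(y)\,dy+\int_{|y|\ge\delta}|g(y)-g(0)|G_\sigma(y)\,dy$$

$$\le\epsilon\int_{|y|<\delta}G_\sigma(y)\,dy+2\|g\|_\infty\int_{|y|\ge\delta}G_\sigma(y)\,dy<\epsilon(1+2\|g\|_\infty).$$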


Proposition 8.9.10. If $1\le p<\infty$ and $f\in\mathfrak{L}^p(\mathbb{R})$, then $\lim_{\sigma\downarrow0}\|f*G_\sigma-f\|_p=0$.

Proof. Replicating the estimates in the proof of theorem 8.7.12, we obtain

$$\|f*G_\sigma-f\|_p^p\le\int_{\mathbb{R}}\|\tau_yf-f\|_p^pG_\sigma(y)\,dy.$$

The function $g(y)=\|\tau_yf-f\|_p^p$ is continuous by problem 4 on section 8.7 and is bounded because $|g(y)|\le(\|\tau_yf\|_p+\|f\|_p)^p=2^p\|f\|_p^p$.

Applying the previous example, we have $\lim_{\sigma\downarrow0}\int_{\mathbb{R}}\|\tau_yf-f\|_p^pG_\sigma(y)\,dy=g(0)=0$. ∎
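To see the proposition at work for $p=1$, take $f$ to be the indicator function of $[-1,1]$ (a hypothetical test function chosen because the convolution has a closed form): $(f*G_\sigma)(x)=\Phi\big(\tfrac{x+1}{\sigma}\big)-\Phi\big(\tfrac{x-1}{\sigma}\big)$, where $\Phi$ is the standard normal distribution function. The sketch below estimates $\|f*G_\sigma-f\|_1$ by a midpoint rule on a truncated interval; the truncation width and step count are assumptions.

```python
import math

def std_normal_cdf(u):
    """Standard normal distribution function, expressed through erf."""
    return 0.5 * (1 + math.erf(u / math.sqrt(2)))

def smoothed_indicator(x, sigma):
    """(f * G_sigma)(x) for f = indicator of [-1, 1], in closed form."""
    return std_normal_cdf((x + 1) / sigma) - std_normal_cdf((x - 1) / sigma)

def l1_error(sigma, half_width=8.0, steps=32_000):
    """Midpoint-rule estimate of the 1-norm of f * G_sigma - f on [-half_width, half_width]."""
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        f_x = 1.0 if abs(x) <= 1 else 0.0
        total += abs(smoothed_indicator(x, sigma) - f_x)
    return total * h

sigmas = (1.0, 0.5, 0.1, 0.01)
errors = [l1_error(s) for s in sigmas]
for s, e in zip(sigmas, errors):
    print(f"sigma={s:<5} ||f*G_sigma - f||_1 ~ {e:.4f}")
```

The error decreases roughly in proportion to $\sigma$, even though $f*G_\sigma$ cannot converge uniformly to the discontinuous function $f$.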


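Example 8. We will later need the identity $G_\sigma(x)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}G_1(\sigma t)e^{ixt}\,dt$:

$$G_\sigma(x)=\frac{1}{\sigma}G_1(x/\sigma)=\frac{1}{\sigma}\hat G_1(-x/\sigma)=\frac{1}{\sigma\sqrt{2\pi}}\int_{\mathbb{R}}G_1(y)\exp\left(iy\frac{x}{\sigma}\right)dy=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}G_1(\sigma t)e^{ixt}\,dt.\ ∎$$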
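Theorem 8.9.11 (the inversion theorem). If $f$ and $\hat f$ are in $\mathfrak{L}^1(\mathbb{R})$, then for almost every $x\in\mathbb{R}$,

$$f(x)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\hat f(t)e^{ixt}\,dt.$$

Proof. Using the previous example, we have

$$f*G_\sigma(x)=\int_{\mathbb{R}}f(x-t)G_\sigma(t)\,dt=\int_{\mathbb{R}}f(x-t)\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}G_1(\sigma y)e^{ity}\,dy\,dt$$

$$=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\int_{\mathbb{R}}f(x-t)e^{ity}\,dt\,G_1(\sigma y)\,dy$$

$$=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\int_{\mathbb{R}}f(u)e^{-iuy}\,du\,e^{ixy}G_1(\sigma y)\,dy$$

$$=\int_{\mathbb{R}}\hat f(y)e^{ixy}G_1(\sigma y)\,dy.$$

The summary of the above calculations is that

$$f*G_\sigma(x)=\int_{\mathbb{R}}\hat f(y)e^{ixy}G_1(\sigma y)\,dy.$$

Now consider the sequence $\sigma_n=1/n$. By the identity we just established,

$$f*G_{\sigma_n}(x)=\int_{\mathbb{R}}\hat f(y)e^{ixy}G_1(\sigma_ny)\,dy.\qquad(\S)$$

On the one hand, $|\hat f(y)e^{ixy}G_1(\sigma_ny)|\le|\hat f(y)|/\sqrt{2\pi}$, $\hat f\in\mathfrak{L}^1(\mathbb{R})$, and $\lim_nG_1(\sigma_ny)=G_1(0)=\frac{1}{\sqrt{2\pi}}$. Thus, by the dominated convergence theorem, the right side of identity (§) converges to $\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\hat f(y)e^{ixy}\,dy$ for every $x\in\mathbb{R}$.

On the other hand, by proposition 8.9.10, the left side of identity (§) converges to $f$ in $\mathfrak{L}^1(\mathbb{R})$. By example 5 in section 8.3, the sequence $f*G_{\sigma_n}$ contains a subsequence that converges a.e. to $f$.

Putting the last two facts together, we arrive at the inversion formula. ∎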
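The inversion formula can be tested numerically on a function for which both $f$ and $\hat f$ decay quickly. The sketch below uses a shifted Gaussian (a hypothetical test function), computes $\hat f$ on a grid with the trapezoid rule, and then evaluates the right side of the inversion formula from those samples. The grid spacing, the truncation interval $[-12,12]$, and all names are illustrative assumptions.

```python
import cmath
import math

H = 0.05                                   # grid spacing
GRID = [-12 + k * H for k in range(481)]   # trapezoid nodes on [-12, 12]
C = 1 / math.sqrt(2 * math.pi)             # normalization constant 1/sqrt(2*pi)

def trapezoid(values):
    """Composite trapezoid rule for samples taken on GRID."""
    return H * (sum(values) - 0.5 * (values[0] + values[-1]))

def f(t):
    """A shifted Gaussian; both f and its transform are integrable."""
    return math.exp(-(t - 1) ** 2 / 2)

# Forward transform sampled on the grid: f_hat(y) = C * integral of f(t) e^{-iyt} dt.
f_hat = [C * trapezoid([f(t) * cmath.exp(-1j * y * t) for t in GRID]) for y in GRID]

def inverse_at(x):
    """C * integral of f_hat(y) e^{ixy} dy, the right side of the inversion formula."""
    return C * trapezoid([fy * cmath.exp(1j * x * y) for fy, y in zip(f_hat, GRID)])

max_error = max(abs(inverse_at(x) - f(x)) for x in (-1.0, 0.0, 0.5, 2.0))
print(max_error)
```

Here $f$ is continuous, so the formula holds at every point, and the numerical reconstruction agrees with $f$ to rounding accuracy at the sampled points.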


Observe that the function $g(x)=\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\hat f(t)e^{ixt}\,dt$ is in $C_0(\mathbb{R})$ by an argument identical to that in the proof of theorem 8.9.8. Thus the assumptions of the above theorem imply that $f$ is equal a.e. to a $C_0(\mathbb{R})$ function.

Corollary 8.9.12 (the uniqueness theorem). If $f\in\mathfrak{L}^1(\mathbb{R})$ and $\hat f=0$, then $f(x)=0$ for a.e. $x\in\mathbb{R}$. ∎


Orthogonal Polynomials: One More Time 


In section 4.10, we studied the space H of continuous, square integrable functions 
with respect to a weight function w. The inner product we used was given by the 
formula 


$$\langle f,g\rangle=\int f(x)g(x)w(x)\,dx.$$


We are now ready to settle a question that could not be answered completely in 
chapter 4. What is the smallest Hilbert space that contains the space H? The answer 
is now within our reach. 


If we define a finite positive measure $\mu$ on the $\sigma$-algebra $\mathcal{L}$ of Lebesgue measurable sets by $d\mu=w\,d\lambda$, where $\lambda$ is Lebesgue measure on $\mathbb{R}$, then the Hilbert space we seek is (or should be, if there is justice) $\mathfrak{L}^2(\mathbb{R},\mathcal{L},\mu)=\mathfrak{L}^2(\mu)$.


In the case of Legendre polynomials, the situation is simple. Since $w(x)=1$, the measure $\mu$ is nothing other than the Lebesgue measure on $(-1,1)$, and $\mathfrak{L}^2(\mu)=\mathfrak{L}^2(-1,1)$. As we observed in section 4.10, the space $H$ contains the space $C[-1,1]$. By theorem 4.10.8, the linear span of the sequence of normalized Legendre polynomials $(P_n)$ is dense in the space $C[-1,1]$. By example 1 in section 8.7, $C[-1,1]$ is dense in $\mathfrak{L}^2(-1,1)$. It follows that the linear span of $(P_n)$ is dense in $\mathfrak{L}^2(-1,1)$, and we have proved the following result.


Theorem 8.9.13. The normalized Legendre polynomials $P_n$ form an orthonormal basis for $\mathfrak{L}^2(-1,1)$.


The situation is far less obvious in the case of Hermite polynomials. In this case, $d\mu=e^{-x^2}d\lambda$, and it is true that the normalized Hermite polynomials $\tilde H_n=\dfrac{H_n}{\sqrt{n!\,2^n\sqrt{\pi}}}$ (see problem 15 on section 4.10) form an orthonormal basis for $\mathfrak{L}^2(\mu)$. Equivalently, we prove the following.


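Theorem 8.9.14. If $f\in\mathfrak{L}^2(\mu)$ and $\int_{\mathbb{R}}f(x)H_n(x)\,d\mu=0$ for all $n\in\mathbb{N}$, then $f(x)=0$ for a.e. $x\in\mathbb{R}$.

Proof. Since $\operatorname{Span}(\{H_0,\dots,H_n\})=\operatorname{Span}(\{1,x,\dots,x^n\})$, the assumption is equivalent to $\int_{\mathbb{R}}f(x)x^ne^{-x^2}\,dx=0$ for all $n\in\mathbb{N}$.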
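Because $\mu$ is a finite measure, $\mathfrak{L}^2(\mu)\subseteq\mathfrak{L}^1(\mu)$. In particular, $f\in\mathfrak{L}^1(\mu)$, and the function $g(t)=f(t)e^{-t^2}\in\mathfrak{L}^1(\lambda)$. The proof will be complete if we show that $\hat g=0$.

We leave it to the reader to verify that, for a fixed $x\in\mathbb{R}$, the function $h(t)=e^{|xt|}\in\mathfrak{L}^2(\mu)$. It follows that the product $f(t)h(t)\in\mathfrak{L}^1(\mu)$; hence $f(t)e^{|xt|}e^{-t^2}\in\mathfrak{L}^1(\lambda)$. Now

$$\sqrt{2\pi}\,\hat g(x)=\int_{\mathbb{R}}f(t)e^{-t^2}e^{-ixt}\,dt=\int_{\mathbb{R}}f(t)e^{-t^2}\sum_{n=0}^\infty\frac{(-ixt)^n}{n!}\,dt=\sum_{n=0}^\infty\frac{(-ix)^n}{n!}\int_{\mathbb{R}}f(t)t^ne^{-t^2}\,dt=0.$$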


The term-by-term integration of the series is justified by the dominated convergence theorem because the sequence of functions $f(t)e^{-t^2}\sum_{j=0}^{n}\frac{(-ixt)^j}{j!}$ is dominated by the integrable function $|f(t)|e^{|xt|}e^{-t^2}$. ∎
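The orthonormality of the normalized Hermite polynomials with respect to $d\mu=e^{-x^2}d\lambda$ can be checked numerically. The sketch below assumes the physicists' convention $H_0=1$, $H_1=2x$, $H_{k+1}=2xH_k-2kH_{k-1}$ (which matches the normalizing constant $\sqrt{n!\,2^n\sqrt\pi}$) and approximates the integrals with a trapezoid rule on $[-10,10]$; the function names are illustrative.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def normalized_hermite(n, x):
    """H_n divided by sqrt(n! 2^n sqrt(pi))."""
    return hermite(n, x) / math.sqrt(math.factorial(n) * 2**n * math.sqrt(math.pi))

def inner_product(m, n, half_width=10.0, steps=4000):
    """Trapezoid estimate of the integral of H~_m H~_n e^{-x^2} over the real line."""
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * normalized_hermite(m, x) * normalized_hermite(n, x) * math.exp(-x * x)
    return total * h

# Gram matrix of the first five normalized Hermite polynomials.
gram = [[inner_product(m, n) for n in range(5)] for m in range(5)]
for row in gram:
    print([round(v, 6) for v in row])
```

The Gram matrix comes out as the identity up to rounding. This confirms orthonormality only; completeness is the content of theorem 8.9.14.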


Exercises 


1. Prove that if $f\in\mathfrak{L}^1(-\pi,\pi)$ and $\sum_{n\in\mathbb{Z}}|\hat f(n)|<\infty$, then $f(x)=\sum_{n\in\mathbb{Z}}\hat f(n)e^{inx}$ a.e. In particular, $f$ is equal a.e. to a continuous, $2\pi$-periodic function. See theorem 4.10.7.

2. Prove that there exists a function $f\in\mathfrak{L}^1(-\pi,\pi)$ such that $S_nf$ does not converge to $f$ in the 1-norm. Hint: View $S_n$ as a bounded linear operator on $\mathfrak{L}^1(-\pi,\pi)$, and use the Banach-Steinhaus theorem.


3. Prove that the family hy(x) = a A > 0, is an approximate identity. 
TUX 


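4. Show that the mapping $\mathfrak{F}:\mathfrak{L}^1(\mathbb{R})\to C_0(\mathbb{R})$ given by $f\mapsto\hat f$ is bounded and that $\|\mathfrak{F}\|=\frac{1}{\sqrt{2\pi}}$.

5. Suppose $f\in\mathfrak{L}^1(\mathbb{R})$, and let $a\in\mathbb{R}$. Prove that

(a) if $g(x)=f(x)e^{iax}$, then $\hat g(x)=\hat f(x-a)$;

(b) if $g(x)=f(x-a)$, then $\hat g(x)=\hat f(x)e^{-iax}$;

(c) if $g(x)=\overline{f(-x)}$, then $\hat g(x)=\overline{\hat f(x)}$; and

(d) if $g(x)=f(x/a)$ and $a>0$, then $\hat g(x)=a\hat f(ax)$.

6. Show that if $f,g\in\mathfrak{L}^1(\mathbb{R})$, then $f*g\in\mathfrak{L}^1(\mathbb{R})$, and $\widehat{(f*g)}=\sqrt{2\pi}\,\hat f\hat g$.

7. Prove that if $f\in\mathfrak{L}^1(\mathbb{R})$ and the function $g(x)=-ixf(x)\in\mathfrak{L}^1(\mathbb{R})$, then $\hat f$ is differentiable and $\frac{d\hat f}{dx}=\hat g(x)$. Hint: Use the definition of derivative and the dominated convergence theorem.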

8. Prove proposition 8.9.9. Hint: Apply the previous exercises to derive the differential equation $\frac{d\hat G_1}{dx}=-x\hat G_1(x)$.

9. Let $g:\mathbb{R}\to\mathbb{R}$ be a bounded continuous function. Prove that, for every $x\in\mathbb{R}$, $\lim_{\sigma\downarrow0}(g*G_\sigma)(x)=g(x)$. This result generalizes example 7.

10. Prove that there does not exist a function $\delta\in\mathfrak{L}^1(\mathbb{R})$ such that $\delta*f=f$ for all $f\in\mathfrak{L}^1(\mathbb{R})$.

11. Verify the details of the proof of theorem 8.9.14.

12. Prove that the sequence $\left(\dfrac{1}{\sqrt{n!\,2^n\sqrt{\pi}}}H_n(x)e^{-x^2/2}\right)$ is an orthonormal basis for $\mathfrak{L}^2(\lambda)$.


APPENDIX A 
The Equivalence of Zorn’s Lemma,
the Axiom of Choice, and the Well
Ordering Principle


Before we embark on the task of proving theorem 2.2.1, we need to develop some background work.


Notation. If S is a subset of a well-ordered set A, we use the notation min{S} to denote the 
least element of S. 


Definition. Let $A$ be a well-ordered set, and let $x\in A$. The initial segment of $A$ determined by $x$ is the set
$$S(A,x)=\{y\in A:y<x\}.$$

Observe that $x=\min\{A-S(A,x)\}$.


Definition. Let $(A_1,\le_1)$ and $(A_2,\le_2)$ be well-ordered subsets of a nonempty set $X$. No ordering of $X$ is assumed. We say that $(A_2,\le_2)$ is a continuation of $(A_1,\le_1)$ if $A_1\subseteq A_2$, $A_1$ is a segment of $A_2$, and $\le_2$ agrees with $\le_1$ on $A_1$. Simply stated, $A_1$ is a segment of $A_2$, and $\le_1$ is the ordering $A_1$ inherits from $(A_2,\le_2)$. We use the notation $(A_1,\le_1)\sqsubseteq(A_2,\le_2)$ to indicate that $(A_2,\le_2)$ is a continuation of $(A_1,\le_1)$ or that $(A_1,\le_1)=(A_2,\le_2)$.

A little reflection reveals that $\sqsubseteq$ is a partial ordering of the collection $\mathfrak{B}$ of well-ordered subsets of $X$.


Lemma A.1. In the notation of the above paragraph, let $\mathfrak{C}=\{(A_\alpha,\le_\alpha)\}_{\alpha\in I}$ be a chain in $\mathfrak{B}$, and let $A=\cup_\alpha A_\alpha$. Then $A$ is well ordered.

Proof. Recall that to say that $\mathfrak{C}$ is a chain means that, for $\alpha,\beta\in I$, either $(A_\alpha,\le_\alpha)\sqsubseteq(A_\beta,\le_\beta)$ or $(A_\beta,\le_\beta)\sqsubseteq(A_\alpha,\le_\alpha)$. Here is an explicit definition of the ordering $\le$ on $A$: for $a,b\in A$, let $\alpha,\beta\in I$ be such that $a\in A_\alpha$, $b\in A_\beta$. Since $\mathfrak{C}$ is a chain, say, $(A_\alpha,\le_\alpha)\sqsubseteq(A_\beta,\le_\beta)$. Then $a,b\in A_\beta$. Define $a\le b$ if $a\le_\beta b$. The fact that $\le$ is well defined follows from the fact that $\mathfrak{C}$ is a chain. It is a simple exercise to show that $\le$ linearly orders $A$. We now show that $\le$ is a well ordering on $A$. Let $S$ be a nonempty subset of $A$. Then $S\cap A_\alpha\ne\emptyset$ for some $\alpha$. Let $a$ be the least element of $S\cap A_\alpha$. We claim that $a$ is the least element of $S$. Let $b\in S$ be such that $b\le a$, and assume that $b\in A_\beta$. If $(A_\beta,\le_\beta)\sqsubseteq(A_\alpha,\le_\alpha)$, then $a,b\in S\cap A_\alpha$ and $b=a$, since $a$ is least in $S\cap A_\alpha$. If $(A_\alpha,\le_\alpha)\sqsubseteq(A_\beta,\le_\beta)$, then $A_\alpha$ is a segment of $A_\beta$, and $b\le a$; hence, $b\in A_\alpha$, and, as before, $b=a$. ∎

Note that $(A,\le)$ in the above lemma is an upper bound of the chain $\mathfrak{C}$. Thus if $A_\alpha\ne A$, then $(A,\le)$ is a continuation of $(A_\alpha,\le_\alpha)$. The reader is encouraged to work out the details. The crucial step to verify is the following: if $a\in A_\alpha$, $y\in A$, and $y<a$, then $y\in A_\alpha$. See lemma A.4.


Theorem A.2. Zorn’s lemma implies the well ordering principle. 


Proof. Let $X$ be a nonempty set. We show that $X$ can be well ordered. Let $\mathfrak{B}$ be the collection of well-ordered subsets of $X$, and partially order $\mathfrak{B}$ by continuation. By lemma A.1, a chain in $\mathfrak{B}$ has an upper bound. By Zorn’s lemma, $\mathfrak{B}$ has a maximal member $(A,\le)$. We claim that $A=X$. If $A\ne X$, pick an element $z$ in $X-A$, and define an ordering $\le_0$ on $Z=A\cup\{z\}$ as follows: retain the ordering $\le$ on $A$, and define $a\le_0z$ for all $a\in A$. Now $(Z,\le_0)$ is a strict continuation of $(A,\le)$, which contradicts the maximality of $(A,\le)$. ∎


Theorem A.3. The well ordering principle implies the axiom of choice. 


Proof. Let $\{X_\alpha\}$ be a nonempty collection of nonempty sets. By assumption, each $X_\alpha$ can be well ordered. Let $x_\alpha$ be the least element of $X_\alpha$, and let $x=(x_\alpha)$. Clearly, $x$ is a choice function and $\prod_\alpha X_\alpha\ne\emptyset$. ∎


We need a final set of details before we prove the last leg of theorem 2.2.1. The definition 
below makes sense for linearly ordered sets, but we limit the discussion to well-ordered sets 
because this is where our interest lies now. 


Definition. A subset $B$ of a well-ordered set $A$ is said to be a section of $A$ if the conditions $a\in A$, $b\in B$ and $a<b$ imply that $a\in B$.


The following facts are obvious: 


(a) A segment of A is a section of A. 
(b) A is a section of itself.


The lemma below is crucial. It is a partial converse of fact (a).

Lemma A.4. Every proper section $B$ of a well-ordered set $A$ is a segment of $A$.

Proof. By assumption, $A-B\ne\emptyset$. Let $x=\min\{A-B\}$. We show that $B=S(A,x)$. If $y\in S(A,x)$, then $y<x$; hence $y\in B$ because otherwise $y$ would contradict the definition of $x$. Conversely, suppose $y\in B$. Now $y\ne x$ since $y\in B$ and $x\notin B$. Also if $y>x$, then, by the definition of a section, $x\in B$, which is a contradiction. Thus $y<x$ and $y\in S(A,x)$. ∎
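Lemma A.4 is easy to test exhaustively on a small finite well-ordered set. The sketch below takes $A=\{0,1,\dots,5\}$ with its usual order (a hypothetical example), enumerates every proper subset that is a section, and checks that each equals $S(A,x)$ with $x=\min\{A-B\}$, exactly as in the proof. The helper names are illustrative.

```python
from itertools import combinations

def is_section(B, A):
    """B is a section of A: a in A, b in B and a < b imply a in B."""
    return all(a in B for b in B for a in A if a < b)

def initial_segment(A, x):
    """S(A, x) = {y in A : y < x}."""
    return {y for y in A if y < x}

A = set(range(6))  # a finite well-ordered set (hypothetical example)

proper_sections = [
    set(B)
    for r in range(len(A))                 # r < |A| keeps B a proper subset of A
    for B in combinations(sorted(A), r)
    if is_section(set(B), A)
]

# Lemma A.4 predicts B = S(A, x) with x = min(A - B) for every proper section B.
checks = [B == initial_segment(A, min(A - B)) for B in proper_sections]
print(len(proper_sections), all(checks))
```

The empty set appears among the proper sections; it is the segment determined by the least element of $A$.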


We adopt the following assumptions and terminology for the remainder of this appendix. 


Let (X, <) be a partially ordered set such that every chain in X has an upper bound but X 
has no maximal element. 
Given a proper chain $A$ in $X$, $A$ has an upper bound, $u$. Since $u$ is not maximal in $X$, there is $x\in X$ such that $x>u$. Clearly, $x\notin A$ because $x$ is a strict upper bound of $A$. Let $\mathfrak{A}$ be the collection of all chains in $X$. Invoking the axiom of choice, we can choose a strict upper bound of each chain in $X$. Thus we have a function $f:\mathfrak{A}\to X$ that assigns to each chain $A$ a strict upper bound, $f(A)$.


Fix an element $x_0\in X$, and define a subset $A$ of $X$ to be conforming if

(a) $(A,\le)$ is well ordered,
(b) $x_0$ is the least element in $A$, and
(c) for every $x\in A$, $f(S(A,x))=x$.


Lemma A.5. If $A$ and $B$ are conforming subsets of $X$ and $A-B\ne\emptyset$, then $B$ is a segment of $A$.

Proof. Let $x=\min\{A-B\}$, and define $C=S(A,x)$. It is easy to verify that $C\subseteq B$. We claim that $C=B$. Suppose, for a contradiction, that $B-C\ne\emptyset$, and let $y=\min\{B-C\}$. We need three steps before we finalize the proof:

1. $S(B,y)$ is a proper subset of $C$: Suppose $u\in B$, $u<y$. If $u\notin C$, then $u\in B-C$, and $u<y$. This contradicts the definition of $y$. If $S(B,y)=C=S(A,x)$, then $y=f(S(B,y))=f(S(A,x))=x$, which is a contradiction because $y\in B$ and $x\notin B$. This proves our assertion that $S(B,y)$ is a proper subset of $C$.

2. $S(B,y)$ is a section of $C$: If $u\in C$, $v\in S(B,y)$, and $u<v$, we show that $u\in S(B,y)$. Since $u<v<y$, $u<y$. If $u\notin S(B,y)$, then $u\notin B$. Thus $u\in A-B$; hence $u\ge x$. But $u\in C=S(A,x)$; hence $u<x$. This contradiction proves that $u\in S(B,y)$; hence $S(B,y)$ is a section of $C$.

3. $S(B,y)$ is a segment of $C$; thus $S(B,y)=S(C,z)$, where $z\in C$: This follows directly from steps 1 and 2 and lemma A.4.

Now we conclude the proof. By step 3, $y=f(S(B,y))=f(S(C,z))=z$. This is a contradiction because $z\in C$, but $y\notin C$ by the definition of $y$. This contradiction proves that $B=C$. ∎


Let $U$ be the union of all the conforming subsets of $X$. The following is a direct result of lemma A.5.

Observation. If $A$ is a conforming subset of $X$, $a\in A$, $y\in U$ and $y<a$, then $y\in A$.

Lemma A.6. $U$ is a conforming subset of $X$. Thus $U$ is the largest conforming subset of $X$.

Proof. It is clear that $x_0$ is the least element of $U$.
The following facts follow directly from the above observation:

(a) If $T\subseteq U$ and $A$ is a conforming subset that intersects $T$, then the least element of $T\cap A$ is also the least element of $T$. Thus $U$ is well ordered.

(b) If $x\in U$, and $A$ is a conforming subset that contains $x$, then $S(U,x)=S(A,x)$. Thus $f(S(U,x))=f(S(A,x))=x$. ∎
Theorems A.2 and A.3 together with theorem A.7 below constitute the proof of
theorem 2.2.1.


Theorem A.7. The axiom of choice implies Zorn’s lemma. 


Proof. Let $(X,\le)$ be a partially ordered set such that each chain in $X$ has an upper bound. If $X$ is a chain, then it would have a maximal (in fact, a largest) element, and there is nothing more to prove. Therefore, we assume that $X$ is not a chain. We show that $X$ has a maximal element. Suppose, for a contradiction, that $X$ contains no maximal element.

We have shown in lemma A.6 that the set $U$ is the largest conforming subset of $X$. Since $U$ is well ordered and $X$ is not, $U\ne X$. Let $w=f(U)$. The set $U\cup\{w\}$ is clearly a conforming subset of $X$ that strictly contains $U$. This contradiction establishes the theorem. ∎
APPENDIX B
Matrix Factorizations


The main purpose of this appendix is to prove a useful matrix factorization result: theorem B.4. Theorem B.3 is a useful by-product of this appendix.


Definition. Let $A$ be an $m\times n$ matrix. By an elementary row (column) operation on $A$, we mean one of the following operations:

(a) multiplying one row (column) by a nonzero scalar $s$
(b) interchanging two rows (columns)
(c) adding a multiple ($\mu$) of one row (column) to another row (column)

Definition. An elementary matrix is a matrix obtained by performing a single elementary row (or column) operation on the identity matrix.


Thus there are three types of elementary matrices:

(a) a scaling matrix (the entry $s\ne0$ is the $i$th diagonal entry): $S(s,i)=\operatorname{diag}(1,\dots,1,s,1,\dots,1)$

(b) an elementary permutation matrix (the off-diagonal entries are $(i,j)$ and $(j,i)$): $P(i,j)$ is the identity matrix with rows $i$ and $j$ interchanged

(c) a multiplier matrix (the entry $\mu$ is the $(j,i)$ entry): $M(\mu,i,j)$ is the identity matrix with the additional entry $\mu$ in position $(j,i)$, $i\ne j$

Elementary matrices are invertible since they have nonzero determinants: $\det(S)=s$, $\det(P)=-1$, and $\det(M)=1$.

The inverse of an elementary matrix is an elementary matrix of the same type. Clearly, a permutation matrix is equal to its own inverse. For a scaling matrix, $S(s,i)^{-1}=S(1/s,i)$.

Observe that a multiplier matrix can be written as $M(\mu,i,j)=I+\mu e_je_i^T$. Using this, it is easy to verify that $(I+\mu e_je_i^T)^{-1}=I-\mu e_je_i^T$. Here $I$ is the identity matrix of the appropriate size.


Lemma B.1. Let $A$ be an $m\times n$ matrix.

(a) If $E$ is an $m\times m$ elementary matrix obtained by performing a certain elementary row operation on $I_m$, then performing the same operation on $A$ produces the matrix $EA$.

(b) If $F$ is an $n\times n$ elementary matrix obtained by performing a certain elementary column operation on $I_n$, then performing the same operation on $A$ produces the matrix $AF$.

Proof. Verifying the lemma when $E$ or $F$ is a scaling matrix or a permutation matrix is trivial. If $F$ is obtained from $I_n$ by adding $\mu$ times column $j$ to column $i$, then $AF=A(I_n+\mu e_je_i^T)=A+\mu(Ae_j)e_i^T$. Now $Ae_j$ is the $j$th column of $A$, and $(Ae_j)e_i^T$ is a matrix whose only nonzero column is the $j$th column of $A$ placed in the $i$th column. Hence the result.


Proving part (a) for the case of left multiplication by a multiplier matrix is similar and is 
left to the reader. 
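The multiplier case of lemma B.1 can be checked on a small example. The sketch below builds $I+\mu e_je_i^T$ with exact rational arithmetic and compares the products $AF$ and $EA$ with the matrices obtained by performing the column and row operations directly. The helper names, the $2\times3$ sample matrix, and the use of 0-based indices are illustrative assumptions.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]

def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y))) for c in range(len(Y[0]))]
            for r in range(len(X))]

def multiplier(n, mu, i, j):
    """I + mu * e_j e_i^T: the entry mu sits in position (j, i) (0-based indices here)."""
    M = identity(n)
    M[j][i] += Fraction(mu)
    return M

A = [[Fraction(v) for v in row] for row in ([1, 2, 3], [4, 5, 6])]   # a 2 x 3 example
mu, i, j = 7, 0, 2

# Column operation performed directly: add mu times column j to column i.
direct = [row[:] for row in A]
for row in direct:
    row[i] += mu * row[j]
via_product = matmul(A, multiplier(3, mu, i, j))

# Row operation: E = I + mu * e_1 e_0^T adds mu times row 0 to row 1.
E = multiplier(2, mu, 0, 1)
row_direct = [A[0][:], [a1 + mu * a0 for a0, a1 in zip(A[0], A[1])]]

print(via_product == direct, matmul(E, A) == row_direct)
```

Right multiplication acts on columns and left multiplication acts on rows, which is the content of parts (b) and (a) of the lemma, respectively.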


Theorem B.2. Given a nonzero $m\times n$ matrix $A$, there exist elementary $m\times m$ matrices $E_1,E_2,\dots,E_k$ and elementary $n\times n$ matrices $F_1,F_2,\dots,F_l$ such that $E_k\cdots E_2E_1AF_1F_2\cdots F_l=D$, where $D$ is a diagonal matrix of the form

$$D=\begin{pmatrix}\operatorname{diag}(d_1,\dots,d_q)&0\\0&0\end{pmatrix},\quad d_i\ne0,\ 1\le i\le q.$$

Proof. In light of lemma B.1, it is enough to prove that $A$ can be reduced to a diagonal matrix through a sequence of elementary row and column operations. We proceed by induction on the number of rows, $m$. The result is true for a $1\times n$ matrix and for any $n\in\mathbb{N}$. Consider the $1\times n$ matrix $A=(a_1,\dots,a_n)$. If $a_1=0$, we interchange the first entry with a later, nonzero entry. Once that is achieved, we subtract $a_j/a_1$ times entry 1 from entry $j$, $2\le j\le n$. We obtain a matrix of the form $(a_1,0,0,\dots,0)$. This proves the base case of our inductive proof. Now we show the inductive step. Suppose the conclusion of the theorem holds for $k\times n$ matrices if $k<m$ and $n\in\mathbb{N}$. Let $A$ be an $m\times n$ matrix. If $a_{1,1}=0$, we can move a nonzero entry from a later row and/or column to the top left entry, so assume that $a_{1,1}\ne0$. Subtracting $a_{i,1}/a_{1,1}$ times the top row from row $i$, $2\le i\le m$, then subtracting $a_{1,j}/a_{1,1}$ times the first column from column $j$, $2\le j\le n$, we obtain a matrix of the form

$$\begin{pmatrix}a_{1,1}&0&\cdots&0\\0&&&\\\vdots&&A'&\\0&&&\end{pmatrix}\qquad(*)$$

Applying the inductive hypothesis to the sub-matrix $A'$, we can reduce $A'$ to a diagonal matrix through a combination of elementary row and column operations. Notice that operating on $A'$ does not perturb the top row or the first column of the matrix (*). ∎
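The reduction described in the proof can be carried out mechanically. The following sketch is an iterative version of the inductive argument, written with exact rational arithmetic: it uses only row and column interchanges and additions of multiples of one row (column) to another, and it accumulates the row operations in a matrix $E$ and the column operations in a matrix $F$ so that $EAF=D$. The function name `diagonalize`, the pivot search order, and the sample matrix are illustrative assumptions, not the text's notation.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]

def matmul(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y))) for c in range(len(Y[0]))]
            for r in range(len(X))]

def diagonalize(A):
    """Return (E, D, F) with E*A*F = D diagonal; E and F record the row and
    column operations applied, so each is a product of elementary matrices."""
    m, n = len(A), len(A[0])
    D = [[Fraction(v) for v in row] for row in A]
    E, F = identity(m), identity(n)
    for p in range(min(m, n)):
        pivot = next(((r, c) for r in range(p, m) for c in range(p, n) if D[r][c] != 0), None)
        if pivot is None:
            break                          # the remaining block is zero
        r, c = pivot
        D[p], D[r] = D[r], D[p]            # row interchange, mirrored on E
        E[p], E[r] = E[r], E[p]
        for row in D:                      # column interchange, mirrored on F
            row[p], row[c] = row[c], row[p]
        for row in F:
            row[p], row[c] = row[c], row[p]
        for i in range(p + 1, m):          # clear column p below the pivot
            q = D[i][p] / D[p][p]
            D[i] = [x - q * y for x, y in zip(D[i], D[p])]
            E[i] = [x - q * y for x, y in zip(E[i], E[p])]
        for j in range(p + 1, n):          # clear row p to the right of the pivot
            q = D[p][j] / D[p][p]
            for row in D:
                row[j] -= q * row[p]
            for row in F:
                row[j] -= q * row[p]
    return E, D, F

A = [[0, 2, 4], [1, 3, 5], [2, 6, 10]]     # a rank-2 example with a zero top-left entry
E, D, F = diagonalize(A)
print([[int(v) for v in row] for row in D])
```

For this sample matrix the nonzero diagonal entries come first, matching the form of $D$ in theorem B.2 with $q=2$.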


Theorem B.3. Given an $m\times n$ matrix $A$, there exists an invertible $m\times m$ matrix $Q$ and an invertible $n\times n$ matrix $P$ such that $Q^{-1}AP$ is diagonal.

Proof. Use the previous theorem and take $Q=(E_k\cdots E_1)^{-1}$ and $P=F_1\cdots F_l$. ∎

Theorem B.4. An invertible square matrix is the product of elementary matrices.

Proof. Using theorem B.2, if $A$ is invertible, so is $D$ (recall that elementary matrices are invertible). In this case, still in the notation of theorem B.2, $q=n$ and $D=S_1S_2\cdots S_n$, where $S_i$ is the scaling matrix $S(d_i,i)$.

Thus $A=E_1^{-1}\cdots E_k^{-1}S_1\cdots S_nF_l^{-1}\cdots F_1^{-1}$, as desired. ∎